Abstract

As conversational threads grow in length, the inference context supplied to a large language model often becomes inefficient or cluttered with irrelevant information. This disclosure describes methods for adjusting inference context based on user-initiated organizational signals, such as renaming threads or grouping them into projects. When a thread is renamed, the conversation is segmented into topics and re-ranked to prune segments that do not align with the new title. Conversely, when threads are organized into a project or given similar names, relevant context from related threads is dynamically loaded to augment the current session. User feedback signals, such as manual ratings or interface interactions, are further utilized to prioritize specific context segments. These techniques reduce computational overhead by lowering token count and improve response quality by focusing the model on relevant conversational history.
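The following is a minimal sketch of the rename-triggered pruning step described above, under assumed details not specified in the disclosure: segments are scored by simple word overlap with the new thread title and retained up to a fixed token budget. The names (`Segment`, `relevance`, `prune_context`) and the scoring method are illustrative only; a production system would more likely use embedding similarity or a learned re-ranker.

```python
# Illustrative sketch only: title-driven context pruning with a
# keyword-overlap relevance score and a fixed token budget.
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    tokens: int  # approximate token count for this segment


def relevance(segment: Segment, title: str) -> float:
    """Score a segment by word overlap with the new thread title."""
    title_words = set(title.lower().split())
    seg_words = set(segment.text.lower().split())
    if not title_words:
        return 0.0
    return len(title_words & seg_words) / len(title_words)


def prune_context(segments: list[Segment], new_title: str,
                  token_budget: int = 2000) -> list[Segment]:
    """Re-rank segments against the new title and keep the most
    relevant ones until the token budget is exhausted."""
    ranked = sorted(segments, key=lambda s: relevance(s, new_title),
                    reverse=True)
    kept, used = [], 0
    for seg in ranked:
        if used + seg.tokens > token_budget:
            continue
        kept.append(seg)
        used += seg.tokens
    # Preserve the original conversational order of surviving segments.
    return [s for s in segments if s in kept]


# Example: renaming a thread to "GPU memory profiling" drops the
# unrelated segment while keeping on-topic history.
history = [
    Segment("How do I profile GPU memory usage in PyTorch?", tokens=120),
    Segment("Unrelated question about dinner recipes.", tokens=80),
    Segment("The allocator stats show where memory is going.", tokens=150),
]
print([s.text for s in prune_context(history, "GPU memory profiling")])
```

The same ranking loop could, under the project-grouping signal, run in the opposite direction: scoring segments from sibling threads against the current session and appending the highest-scoring ones within the remaining token budget.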

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
