Abstract

Retrieval-augmented generation (RAG) is a popular technique for grounding the responses generated by a large language model (LLM). Grounding is performed by consulting a particular database at query time, as the responses are generated. However, naive RAG approaches may produce a set of embeddings that is too large, which raises the computational cost of matching against user queries, or a set that is sparse and provides only coarse context. This disclosure describes techniques for dynamic RAG. Embeddings obtained from the grounding databases are grouped by topic. The user query is then matched against cluster-representative embeddings, which are few enough to keep matching inexpensive while still providing adequate retrieval coverage. A trigger scheme is described that can execute in the background and perform regrouping when the databases are updated.
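The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the disclosure's implementation: the abstract does not name a clustering algorithm, similarity measure, or trigger mechanism, so the choices here (plain k-means seeded with the first k vectors, cosine similarity, a cluster mean as the representative embedding, and a `dirty` flag standing in for the background trigger) are all assumptions, as are the names `TopicalIndex`, `group_by_topic`, and `regroup_if_needed`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest(vector, centroids):
    """Index of the centroid most similar to the vector."""
    return max(range(len(centroids)), key=lambda c: cosine(vector, centroids[c]))


def group_by_topic(embeddings, k, iterations=10):
    """Assumed grouping step: k-means seeded with the first k embeddings."""
    centroids = [list(v) for v in embeddings[:k]]
    for _ in range(iterations):
        assignment = [nearest(v, centroids) for v in embeddings]
        for c in range(len(centroids)):
            members = [v for v, a in zip(embeddings, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the final centroids.
    return centroids, [nearest(v, centroids) for v in embeddings]


class TopicalIndex:
    """Hypothetical index: cluster representatives first, members second."""

    def __init__(self, k):
        self.k = k
        self.docs = []        # list of (text, embedding) pairs
        self.centroids = []   # one representative embedding per topic
        self.assignment = []  # topic id for each entry in self.docs
        self.dirty = False    # set when the database changes

    def add(self, text, embedding):
        """Database update; marks the grouping as stale."""
        self.docs.append((text, embedding))
        self.dirty = True

    def regroup_if_needed(self):
        """Stand-in for the trigger; a real system might run this in the background."""
        if not (self.dirty and self.docs):
            return False
        vectors = [e for _, e in self.docs]
        self.centroids, self.assignment = group_by_topic(
            vectors, min(self.k, len(vectors))
        )
        self.dirty = False
        return True

    def retrieve(self, query, top_n=1):
        """Match the query to a topic, then rank only that topic's members."""
        self.regroup_if_needed()
        best = nearest(query, self.centroids)
        members = [(t, e) for (t, e), a in zip(self.docs, self.assignment) if a == best]
        members.sort(key=lambda te: cosine(query, te[1]), reverse=True)
        return [t for t, _ in members[:top_n]]
```

With k topics over n stored embeddings, a query in this sketch is compared against the k representatives and then only against the members of the selected topic, rather than against all n embeddings. Regrouping here happens lazily on the next call to `retrieve`; running `regroup_if_needed` from a scheduler or database change hook would be closer to the background trigger the abstract describes.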

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Shin, D, "Dynamic Topical Retrieval-Augmented Generation for Accurate and Fast Context Retrieval", Technical Disclosure Commons, (June 16, 2025)
https://www.tdcommons.org/dpubs_series/8235

Technical Disclosure Commons

Defensive Publications Series

Dynamic Topical Retrieval-Augmented Generation for Accurate and Fast Context Retrieval
