Inventor(s)

D Shin

Abstract

Retrieval-augmented generation (RAG) is a popular technique for grounding the responses generated by a large language model (LLM). Grounding is performed by consulting a particular database online, as the responses are generated. However, naive RAG approaches may produce a list of embeddings that is too large, which raises the computational cost of matching against user queries, or a list that is sparse and provides only coarse context. This disclosure describes techniques for dynamic RAG. Embeddings obtained from the databases used for grounding are grouped by topic. User queries are then matched against the cluster-representative embeddings, which are few enough to keep matching inexpensive while still providing adequate retrieval coverage. A trigger scheme that can execute in the background performs regrouping when the databases are updated.
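The flow described in the abstract might be sketched as follows. This is an illustrative sketch only, not the disclosed implementation: the class name `GroupedIndex`, its methods, the k-means-style grouping, the cosine-similarity matching, the evenly spaced centroid initialization, and the toy two-dimensional embeddings are all assumptions made for the example. A "dirty" flag stands in for the trigger scheme; a real system would presumably check it from a background worker.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


class GroupedIndex:
    """Hypothetical index: groups embeddings, matches queries to group centroids."""

    def __init__(self, k):
        self.k = k
        self.docs = []       # list of (text, embedding) pairs
        self.centroids = []  # one cluster-representative embedding per group
        self.groups = []     # groups[i] holds the docs assigned to centroids[i]
        self.dirty = False   # set when the database changes (the "trigger")

    def add(self, text, embedding):
        self.docs.append((text, embedding))
        self.dirty = True

    def _assign(self, centroids):
        groups = [[] for _ in centroids]
        for doc in self.docs:
            best = max(range(len(centroids)),
                       key=lambda c: cosine(doc[1], centroids[c]))
            groups[best].append(doc)
        return groups

    def regroup_if_needed(self, iterations=10):
        """Recompute groups only if the database changed; returns True if it ran."""
        if not self.dirty or len(self.docs) < self.k:
            return False
        vectors = [e for _, e in self.docs]
        step = len(vectors) // self.k
        # Assumption: deterministic, evenly spaced initial centroids.
        centroids = [list(vectors[i * step]) for i in range(self.k)]
        for _ in range(iterations):
            groups = self._assign(centroids)
            # Keep the old centroid if a group ends up empty.
            centroids = [mean([e for _, e in g]) if g else c
                         for g, c in zip(groups, centroids)]
        self.centroids, self.groups = centroids, self._assign(centroids)
        self.dirty = False
        return True

    def retrieve(self, query, top_n=2):
        """Match the query against centroids first, then rank within one group."""
        if not self.centroids:
            return []
        best = max(range(len(self.centroids)),
                   key=lambda c: cosine(query, self.centroids[c]))
        ranked = sorted(self.groups[best],
                        key=lambda d: cosine(query, d[1]), reverse=True)
        return [text for text, _ in ranked[:top_n]]


# Toy usage with made-up 2-D embeddings forming two obvious topics.
index = GroupedIndex(k=2)
for name, vec in [("doc_x1", [1.0, 0.1]), ("doc_x2", [0.9, 0.2]),
                  ("doc_x3", [1.0, 0.0]), ("doc_y1", [0.0, 1.0]),
                  ("doc_y2", [0.1, 0.9]), ("doc_y3", [0.2, 1.0])]:
    index.add(name, vec)
index.regroup_if_needed()
hits = index.retrieve([1.0, 0.0], top_n=2)
```

In this sketch the query is compared against only `k` centroids before any document-level ranking, which is where the cost saving over matching against every embedding would come from. Choosing `k` trades matching cost against how coarse each group's context is.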

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
