Abstract

Techniques presented herein provide an intelligent caching mechanism for a Retrieval Augmented Generation (RAG) model trained with dynamic data that is periodically deleted or modified. The Cache Vector DB is updated based on customer feedback: the method processes both positive and negative feedback while protecting the cache against negative-feedback spam. The framework caches not only the original (primary) question/query and its corresponding response, but also secondary questions/queries. These are either synthetic variants of the original question/query created with GenAI, or queries for which an answer fetched from the cache received positive feedback. Keeping the cache up to date when a source document is modified is also a key part of the process. This strategy increases the likelihood of a Cache Vector DB hit, which decreases processing time and reduces the cost of delivering an answer.
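The caching flow described above can be illustrated with a minimal in-memory sketch. It is not the disclosed implementation: the class and function names (`FeedbackCache`, `embed`, `cosine`), the bag-of-words similarity standing in for a real embedding model and vector database, the hit threshold, and the rule of evicting an answer only after downvotes from several distinct users (one possible form of negative-feedback spam protection) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Answer:
    text: str
    doc_id: str  # source document the answer was generated from
    downvoters: set = field(default_factory=set)  # distinct users who gave negative feedback


class FeedbackCache:
    """Hypothetical stand-in for the Cache Vector DB."""

    def __init__(self, hit_threshold=0.8, evict_after=3):
        self.hit_threshold = hit_threshold  # minimum similarity for a cache hit (assumed value)
        self.evict_after = evict_after      # distinct downvoters needed to evict (assumed rule)
        self.entries = []                   # list of (query vector, Answer)

    def put(self, query, text, doc_id, secondary_queries=()):
        """Cache the primary query plus any synthetic secondary queries for one answer."""
        answer = Answer(text, doc_id)
        for q in (query, *secondary_queries):
            self.entries.append((embed(q), answer))
        return answer

    def lookup(self, query):
        """Return the cached answer for the most similar stored query, or None on a miss."""
        vec = embed(query)
        best = max(self.entries, key=lambda e: cosine(vec, e[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= self.hit_threshold:
            return best[1]
        return None

    def positive_feedback(self, query, answer):
        """Store a positively rated query as a new secondary query for the answer."""
        self.entries.append((embed(query), answer))

    def negative_feedback(self, user_id, answer):
        """Evict the answer only after enough distinct users have downvoted it."""
        answer.downvoters.add(user_id)
        if len(answer.downvoters) >= self.evict_after:
            self.entries = [e for e in self.entries if e[1] is not answer]

    def invalidate_document(self, doc_id):
        """Drop every cached entry derived from a modified or deleted source document."""
        self.entries = [e for e in self.entries if e[1].doc_id != doc_id]
```

In this sketch, repeated downvotes from the same user count once, so a single user cannot force an eviction, and `invalidate_document` would be called whenever a source document changes so that stale answers are not served from the cache.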

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
