Abstract

Systems, methods, and computer program products are provided for mitigating hallucination in generative models such as large language models (LLMs). The system includes a processor configured to receive a query and input the query to a first generative model. The processor is also configured to determine candidate responses based on an output of the first generative model, generate embeddings based on the query, and retrieve data from an embedding-indexed data store based on the embeddings. The processor is further configured to input the retrieved data to a second generative model, generate a summary based on an output of the second generative model, and input the retrieved data and the candidate responses to a third generative model. Finally, the processor is configured to produce filtered responses based on an output of the third generative model, generate a response based on the summary and the filtered responses, and transmit the response.
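The data flow described in the abstract can be illustrated with a minimal Python sketch. This is not the disclosed implementation: the three generative models are replaced by plain callables, the embedding is a toy word-count vector, and every name here (`Pipeline`, `toy_embed`, `toy_candidates`, `toy_summary`, `toy_filter`, `VOCAB`) is a hypothetical stand-in chosen for illustration. The way the summary and filtered responses are joined into a final response is also an assumption, since the abstract does not specify it.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Pipeline:
    candidate_model: Callable[[str], List[str]]               # first model
    summary_model: Callable[[List[str]], str]                 # second model
    filter_model: Callable[[List[str], List[str]], List[str]] # third model
    embed: Callable[[str], Vector]
    store: List[Tuple[Vector, str]]  # embedding-indexed data store
    top_k: int = 1

    def retrieve(self, query_vec: Vector) -> List[str]:
        """Return the top_k stored texts most similar to the query vector."""
        ranked = sorted(self.store,
                        key=lambda item: cosine(query_vec, item[0]),
                        reverse=True)
        return [text for _, text in ranked[: self.top_k]]

    def answer(self, query: str) -> str:
        candidates = self.candidate_model(query)         # candidate responses
        data = self.retrieve(self.embed(query))          # embed, then retrieve
        summary = self.summary_model(data)               # summary of the data
        filtered = self.filter_model(data, candidates)   # drop unsupported ones
        # Assumed combination step: summary followed by surviving candidates.
        return summary + "\nSupported answers: " + ", ".join(filtered)


# --- Toy stand-ins (assumptions, not real models) ---
VOCAB = ["paris", "france", "capital", "berlin", "germany"]


def toy_embed(text: str) -> Vector:
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(words.count(w)) for w in VOCAB]


def toy_candidates(query: str) -> List[str]:
    return ["Paris", "Lyon"]  # "Lyon" plays the role of a hallucination


def toy_summary(data: List[str]) -> str:
    return "Summary: " + " ".join(data)


def toy_filter(data: List[str], candidates: List[str]) -> List[str]:
    # Keep only candidates that literally appear in the retrieved data.
    return [c for c in candidates
            if any(c.lower() in d.lower() for d in data)]


DOCS = ["Paris is the capital of France.",
        "Berlin is the capital of Germany."]

pipeline = Pipeline(
    candidate_model=toy_candidates,
    summary_model=toy_summary,
    filter_model=toy_filter,
    embed=toy_embed,
    store=[(toy_embed(d), d) for d in DOCS],
)

if __name__ == "__main__":
    print(pipeline.answer("What is the capital of France?"))
```

In this sketch the unsupported candidate is removed because it does not appear in the retrieved data. In a real system each callable would wrap a call to an actual generative model, and the store would be a vector index rather than an in-memory list.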

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
