Inventor(s)

NA

Abstract

This disclosure describes techniques that use a large language model (LLM) in a two-phase process to extract precise, verbatim text citations from documents that are relevant to a user query. In the first phase, an LLM performs a high-level analysis of the document in relation to the query, identifying the sections likely to be relevant and generating a corresponding analysis of the document. In the second phase, the document, the query, and the output of the first phase are provided as inputs to an LLM that is instructed to extract verbatim citations. In this phase, the LLM is constrained by a tool schema to return citations in a specific format. This ensures that the extracted citations are exact quotes from the source document and are in no way modified by the LLM. The techniques are particularly applicable in domains such as legal, medical, or insurance, where users who seek answers to queries require verbatim citations from source documents. The two-phase process described herein, which extracts verbatim citations using an LLM and tool calling, can outperform other approaches, such as retrieval-augmented generation.
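The two phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the tool name `extract_citations`, the `quote` field, and the callables `analyze` and `call_tool` (stand-ins for whatever LLM client is used) are all hypothetical. The final substring check is an extra safeguard added for this sketch; the disclosure itself attributes verbatim output to the tool schema.

```python
from typing import Callable, Dict, List

# Hypothetical tool schema in JSON Schema style. The tool name and the
# "quote" field are assumptions made for this sketch.
CITATION_TOOL: Dict = {
    "name": "extract_citations",
    "description": "Return exact quotes from the document that answer the query.",
    "parameters": {
        "type": "object",
        "properties": {
            "citations": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {"quote": {"type": "string"}},
                    "required": ["quote"],
                },
            }
        },
        "required": ["citations"],
    },
}


def keep_verbatim(document: str, tool_args: Dict) -> List[str]:
    """Keep only quotes that occur character-for-character in the document.

    This check is an added safeguard for the sketch, not part of the source.
    """
    quotes = [c.get("quote", "") for c in tool_args.get("citations", [])]
    return [q for q in quotes if q and q in document]


def extract_citations(
    document: str,
    query: str,
    analyze: Callable[[str, str], str],
    call_tool: Callable[[str, str, str, Dict], Dict],
) -> List[str]:
    # Phase 1: high-level analysis of the document with respect to the query.
    analysis = analyze(document, query)
    # Phase 2: the LLM receives the document, the query, and the phase-1
    # analysis, and must answer through the citation tool.
    tool_args = call_tool(document, query, analysis, CITATION_TOOL)
    return keep_verbatim(document, tool_args)
```

Passing the LLM calls in as plain callables keeps the sketch independent of any particular provider's API and makes the two-phase flow easy to exercise with stubbed responses.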

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
