The baseline capabilities of large language models (LLMs) are not always well suited to answering queries in organizational contexts that involve internal, proprietary content with domain-specific language and jargon. While retrieval-augmented generation (RAG) enables LLMs to deliver responses that take an organization's proprietary content into account as query context, there is no well-defined approach to deploying RAG while safeguarding that content. This disclosure describes techniques that enhance RAG so that any user affiliated with an organization can perform semantic queries limited to the documents the user has the right to access. Permissions-aware RAG is implemented by retrieving, for each piece of internal content, the set of all users and user groups within the organization permitted to access it. A vector database stores content embeddings along with the identities of the principals that have access to each piece of content. When a user submits a query, documents are retrieved from the vector database by matching the user's credentials against the stored principals. The retrieved document subset is provided to an LLM and serves as the permissions-restricted context for the query. This context enables the LLM to generate responses that take the organizational context into account while preventing inadvertent leakage of information.
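
The retrieval step described above can be illustrated with a minimal sketch. It is not the disclosure's implementation: the in-memory list standing in for the vector database, the `Record` type, the `retrieve` function, the `user:`/`group:` principal naming, and the toy three-dimensional embeddings are all assumptions made for illustration. The sketch filters stored records by the caller's principals before ranking by cosine similarity, so a document the user cannot access never reaches the LLM context.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Record:
    """One entry in the (hypothetical) vector store."""
    doc_id: str
    embedding: tuple[float, ...]
    principals: frozenset[str]  # users and groups permitted to read the document


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(index, query_embedding, user_principals, k=3):
    """Return the ids of the k most similar documents the user may access."""
    # Permission filter first: keep records that share at least one principal
    # with the user (the user's own identity or any group the user belongs to).
    allowed = [r for r in index if r.principals & user_principals]
    ranked = sorted(
        allowed,
        key=lambda r: cosine(r.embedding, query_embedding),
        reverse=True,
    )
    return [r.doc_id for r in ranked[:k]]


# Toy data: embeddings are made up; a real system would use an embedding model.
index = [
    Record("hr-policy", (0.9, 0.1, 0.0), frozenset({"group:hr"})),
    Record("eng-design", (0.1, 0.9, 0.0), frozenset({"group:eng", "user:alice"})),
    Record("all-hands", (0.5, 0.5, 0.1), frozenset({"group:everyone"})),
]

alice = frozenset({"user:alice", "group:eng", "group:everyone"})
bob = frozenset({"user:bob", "group:everyone"})
query = (0.8, 0.2, 0.0)  # closest to "hr-policy", which neither user may read

print(retrieve(index, query, alice))  # ['all-hands', 'eng-design']
print(retrieve(index, query, bob))    # ['all-hands']
```

In this example the most similar document, `hr-policy`, is excluded for both users because neither holds the `group:hr` principal. The returned ids would then be resolved to document text and passed to the LLM as the permissions-restricted context. A production vector database would typically apply the same principal check as a metadata filter inside the similarity search rather than in application code.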

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.