Abstract
Proposed herein is a Large Language Model (LLM) fine-tuning methodology, referred to as Quantized Low-Rank Adaptation-Blend (QLoRA-Blend), that enables small LLMs to outperform larger state-of-the-art models with minimal financial investment. By merging multiple domain-specific QLoRA adapters using Spherical Linear Interpolation (SLERP), the QLoRA-Blend fine-tuning technique achieves superior accuracy and efficiency in Retrieval-Augmented Generation (RAG) systems.
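The abstract does not spell out the merging step itself, so the following is a minimal sketch of how SLERP could blend one pair of corresponding adapter weight tensors. The function name, the NumPy representation, and the blend factor `t` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two adapter weight tensors.

    w_a, w_b: corresponding weight tensors from two domain-specific QLoRA
              adapters (hypothetical inputs for illustration).
    t:        blend factor in [0, 1]; 0 returns w_a, 1 returns w_b.
    """
    a, b = w_a.flatten(), w_b.flatten()
    # Angle between the two weight vectors on the hypersphere.
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        blended = (1.0 - t) * a + t * b
    else:
        blended = (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)
    return blended.reshape(w_a.shape)
```

Applied per-tensor across the low-rank matrices of two QLoRA adapters, this would yield a single blended adapter in the spirit of the QLoRA-Blend technique described above.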
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Davidson, Nicholas Robert; Sattiraju, Siva Kanth; and Gangadharaiah, Umesh, "MULTI-STAGE FINE-TUNING PROCESS FOR OPTIMIZING SMALL LLMS IN RAG APPLICATIONS", Technical Disclosure Commons, (June 06, 2024)
https://www.tdcommons.org/dpubs_series/7085