MULTI-STAGE FINE-TUNING PROCESS FOR OPTIMIZING SMALL LLMS IN RAG APPLICATIONS
Abstract
Proposed herein is a Large Language Model (LLM) fine-tuning methodology, referred to as Quantized Low-Rank Adaptation-Blend (QLoRA-Blend), that enables small LLMs to outperform larger state-of-the-art models with minimal financial investment. By integrating multiple domain-specific QLoRA adapters using Spherical Linear Interpolation (SLERP), the QLoRA-Blend fine-tuning technique achieves superior accuracy and efficiency in Retrieval-Augmented Generation (RAG) systems.
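To make the blending step concrete, the following is a minimal sketch of SLERP applied layer-wise to two adapter weight sets. The adapter names, the toy tensor shapes, and the 50/50 blend ratio are illustrative assumptions, not details drawn from the paper itself.

```python
# Hedged sketch: SLERP blending of two hypothetical QLoRA adapter weight sets.
# Names, shapes, and the blend ratio are assumptions for illustration only.
import numpy as np

def slerp(w0: np.ndarray, w1: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors, treated as flat vectors."""
    v0, v1 = w0.ravel(), w1.ravel()
    # Cosine of the angle between the two weight vectors.
    cos_omega = np.dot(v0, v1) / max(np.linalg.norm(v0) * np.linalg.norm(v1), eps)
    cos_omega = np.clip(cos_omega, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if omega < eps:
        # Nearly collinear vectors: fall back to plain linear interpolation.
        blended = (1.0 - t) * v0 + t * v1
    else:
        sin_omega = np.sin(omega)
        blended = (np.sin((1.0 - t) * omega) / sin_omega) * v0 \
                  + (np.sin(t * omega) / sin_omega) * v1
    return blended.reshape(w0.shape)

def blend_adapters(adapter_a: dict, adapter_b: dict, t: float = 0.5) -> dict:
    """Blend two adapters (dicts of parameter name -> tensor) layer by layer via SLERP."""
    return {name: slerp(adapter_a[name], adapter_b[name], t) for name in adapter_a}

# Illustrative usage with toy low-rank update matrices for two hypothetical domains.
rng = np.random.default_rng(0)
adapter_legal = {"layer0.lora_A": rng.normal(size=(8, 64))}
adapter_medical = {"layer0.lora_A": rng.normal(size=(8, 64))}
merged = blend_adapters(adapter_legal, adapter_medical, t=0.5)
print(merged["layer0.lora_A"].shape)  # (8, 64)
```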
This paper has been withdrawn.