Abstract
The present disclosure provides a method and a system for automatically selecting context-aware large language models (LLMs) using a reward-based multi-objective learning mechanism. The method includes receiving an input task and extracting a corresponding context vector comprising one or more contextual attributes. For each of a plurality of LLMs, the method computes a confidence-adjusted expected reward (an expected performance score adjusted for confidence) from the context vector using a contextual learning algorithm. The method then selects at least one of the plurality of LLMs having the highest confidence-adjusted expected reward for the input task and executes the selected LLM or LLMs to generate one or more output responses. Finally, the method calculates a reward value associated with the one or more output responses and updates one or more parameters of the contextual learning algorithm based on the calculated reward value.
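The select, execute, reward, update loop described in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific contextual learning algorithm, so this sketch assumes a LinUCB-style contextual bandit with one linear model per LLM, and it assumes the multi-objective reward is a weighted sum of quality, latency, and cost. All names here (`scalarize_reward`, `LinUCBArm`, `LLMSelector`), the weights, and the normalization caps are hypothetical and chosen only for illustration.

```python
import math


def scalarize_reward(quality, latency_s, cost_usd,
                     weights=(0.6, 0.2, 0.2),
                     max_latency_s=10.0, max_cost_usd=0.05):
    """Collapse several objectives into one reward in [0, 1].

    Assumption: a weighted sum; the disclosure may combine objectives differently.
    `quality` is expected in [0, 1]; latency and cost are capped and inverted
    so that lower latency and lower cost give a higher reward.
    """
    w_q, w_l, w_c = weights
    latency_term = 1.0 - min(latency_s / max_latency_s, 1.0)
    cost_term = 1.0 - min(cost_usd / max_cost_usd, 1.0)
    return w_q * quality + w_l * latency_term + w_c * cost_term


class LinUCBArm:
    """Per-LLM linear reward model with an upper-confidence bonus."""

    def __init__(self, dim):
        self.dim = dim
        # Inverse of A = I + sum(x x^T), kept up to date incrementally.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        # b = sum(reward * x)
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(self.dim))
                for row in self.a_inv]

    def score(self, x, alpha):
        """Confidence-adjusted expected reward: theta.x + alpha * sqrt(x^T A^-1 x)."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + alpha * width

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A^-1, then accumulate b."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


class LLMSelector:
    """Chooses the LLM with the highest confidence-adjusted expected reward."""

    def __init__(self, model_names, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength (hypothetical default)
        self.arms = {name: LinUCBArm(dim) for name in model_names}

    def select(self, context):
        scores = {name: arm.score(context, self.alpha)
                  for name, arm in self.arms.items()}
        return max(scores, key=scores.get)

    def update(self, name, context, reward):
        self.arms[name].update(context, reward)
```

In this sketch the confidence term shrinks as a model accumulates observations for similar contexts, so rarely tried LLMs still get explored while consistently well-rewarded ones are preferred over time. A caller would extract a context vector from the task, call `select`, run the chosen LLM, pass the measured quality, latency, and cost through `scalarize_reward`, and feed the result to `update`.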
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
KUMAR, SUMIT; MOHAN, SHIVAM; SINHA, RAVI SHANKER KUMAR; DHAMIJA, DHEERAJ; and MISHRA, SOUMENDRA KUMAR, "SYSTEM AND METHOD FOR AUTOMATICALLY SELECTING CONTEXT-AWARE LLMS USING A REWARD-BASED MULTI-OBJECTIVE LEARNING MECHANISM", Technical Disclosure Commons, (December 17, 2025)
https://www.tdcommons.org/dpubs_series/9049