Abstract

Large language models (LLMs) can interact with external tools to retrieve real-time information, make reservations, etc. While complete tool specifications for the external tools can be included in the prompt of an LLM, doing so increases context size and latency. This disclosure describes techniques that use small tool-selector models alongside large language models to select tools that are appropriate to the task specified in the prompt. The techniques preserve prompt space by ensuring that the LLM accesses the full specifications of a relevant tool only when necessary. The techniques enable an LLM to handle complex requests, such as multi-step tasks or interactions with external services, while maintaining high responsiveness and reducing the computational overhead associated with processing extensive tool documentation.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
