Abstract

Large language models often use external tools exposed through programming interfaces to solve complex tasks. While small language models can run on local devices, the tools they call typically reside on remote servers, which prevents offline use and adds latency. A method is disclosed for porting online tools to local implementations by grounding the porting process in observed user usage patterns. Actual tool usage is monitored to record a distribution of inputs and outputs. From this recorded distribution, simplified tool variants are synthesized using techniques such as zero-shot prediction, few-shot learning, or generation of local executable code. The candidate variants are validated against the original tool's outputs and against user-defined constraints. The disclosure enables offline tool access, reduces inference costs, and improves perceived latency by deploying simplified, task-specific tool versions directly on a local device.
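The monitor-synthesize-validate loop described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the remote tool, the logging loop, the synthesis step (here, fitting local executable code to the logged input-output pairs), and the error tolerance are all hypothetical stand-ins chosen for the example.

```python
import random

random.seed(0)

# Hypothetical remote tool standing in for an online API (assumption for
# illustration): converts miles to kilometers on a server.
def remote_unit_converter(miles: float) -> float:
    return round(miles * 1.60934, 2)

# Step 1: monitor actual tool usage and record the distribution of
# inputs and outputs.
usage_log = []
for _ in range(100):
    x = round(random.uniform(0.0, 50.0), 1)  # observed user inputs
    usage_log.append((x, remote_unit_converter(x)))

# Step 2: synthesize a simplified local variant grounded in the recorded
# distribution -- here, local executable code whose single parameter is
# recovered from the logged pairs (a least-squares-style ratio fit).
total_in = sum(x for x, _ in usage_log)
total_out = sum(y for _, y in usage_log)
slope = total_out / total_in

def local_unit_converter(miles: float) -> float:
    # Simplified, task-specific version deployable on the local device.
    return round(miles * slope, 2)

# Step 3: validate the local variant against the original tool's outputs
# under a user-defined constraint (maximum absolute error).
MAX_ERROR = 0.05
validated = all(
    abs(local_unit_converter(x) - y) <= MAX_ERROR for x, y in usage_log
)
print("local variant validated:", validated)
```

If validation fails against the recorded distribution, the variant would be discarded or re-synthesized with a richer technique (e.g., few-shot prompting of a small on-device model) before replacing the remote call.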

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
