Device Information as Context Input for Custom Response Generation by a Virtual Assistant or Chatbot
Abstract
Chatbots and virtual assistants powered by a multimodal large language model (LLM) can respond to a wide range of queries and can process input in text, audio, video, or other formats. However, when such a model is deployed on a user device such as a smartphone, it can be difficult to determine which data to capture and analyze and in which modality to provide the response. This disclosure describes the use of contextual signals such as device capability, device settings and state, connections with other devices, etc., obtained and used with user permission, to determine the input to be analyzed by the LLM and the format of the output. For example, an on-device chatbot can detect that the user is sharing their screen while issuing a command and use the screen content as context when generating a response.
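A minimal sketch of the idea is shown below, assuming a simple on-device pipeline that gathers permitted context signals, attaches them to the user command, and selects an output modality from device state. The DeviceContext fields, the build_llm_request function, and the modality-selection rules are illustrative assumptions, not details from the disclosure.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DeviceContext:
    """Contextual signals gathered on-device, only with user permission."""
    screen_sharing_active: bool = False
    screen_text: Optional[str] = None                        # e.g., accessibility/OCR snapshot of the shared screen
    connected_devices: list = field(default_factory=list)    # e.g., ["earbuds", "smartwatch"]
    audio_output_available: bool = True
    display_available: bool = True

def build_llm_request(user_command: str, ctx: DeviceContext, permission_granted: bool) -> dict:
    """Combine the user command with permitted device context and pick an output modality."""
    context_parts = []
    if permission_granted and ctx.screen_sharing_active and ctx.screen_text:
        # Screen content is attached as context only because sharing is active and permitted.
        context_parts.append(f"Shared screen content:\n{ctx.screen_text}")
    if permission_granted and ctx.connected_devices:
        context_parts.append(f"Connected devices: {', '.join(ctx.connected_devices)}")

    # Choose the response modality from device capability and state,
    # e.g., prefer audio when no display is available.
    if not ctx.display_available and ctx.audio_output_available:
        output_modality = "audio"
    else:
        output_modality = "text"

    prompt = "\n\n".join(context_parts + [f"User command: {user_command}"])
    return {"prompt": prompt, "output_modality": output_modality}

# Example: the assistant detects screen sharing and uses the screen content as context.
request = build_llm_request(
    user_command="Summarize what's on my screen",
    ctx=DeviceContext(screen_sharing_active=True,
                      screen_text="Quarterly report draft...",
                      connected_devices=["earbuds"]),
    permission_granted=True,
)
print(request["output_modality"])
print(request["prompt"])

In this sketch the permission check gates every contextual signal, so the LLM prompt never includes device data the user has not consented to share.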
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Kaushansky, Karen; Sheeder, Tony; Gunaratne, Junius; Bloom, Jon; Pyatigorskiy, Ilya; Akash, Kinda; and Pham, Ryan, "Device Information as Context Input for Custom Response Generation by a Virtual Assistant or Chatbot", Technical Disclosure Commons, (March 17, 2025)
https://www.tdcommons.org/dpubs_series/7911