Abstract
Existing hazard detection methods often depend on cloud processing, which can introduce latency and privacy concerns. These methods are also frequently limited to a single data modality and therefore lack comprehensive contextual understanding. This disclosure describes techniques for on-device, multi-modal hazard detection using a language model or a large language model (LLM). The method involves continuous, real-time processing of raw data streams from various sensors, such as cameras and microphones. An on-device LLM analyzes this fused sensor data to reason about the user's environment and estimate a probability of danger. The purpose is to provide timely, context-aware safety notifications to the user while preserving data privacy by performing all processing locally on the device.
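The sketch below illustrates one possible shape of the described pipeline: a continuous loop that fuses camera and microphone observations, asks an on-device language model for a danger probability, and notifies the user when that probability exceeds a threshold. The sensor-capture helpers, the OnDeviceLLM interface, and the threshold value are hypothetical placeholders introduced for illustration; they are not part of the disclosure and do not correspond to any real API.

```python
# Minimal sketch of the disclosed on-device hazard-detection loop.
# All names below (capture_snapshot, OnDeviceLLM, DANGER_THRESHOLD) are
# hypothetical placeholders, not actual APIs from the disclosure.

import time
from dataclasses import dataclass

DANGER_THRESHOLD = 0.8  # assumed threshold for issuing a notification


@dataclass
class SensorSnapshot:
    """Fused multi-modal observation captured at one point in time."""
    image_caption: str   # e.g., output of an on-device vision encoder
    audio_events: str    # e.g., output of an on-device sound classifier


class OnDeviceLLM:
    """Placeholder for a local language-model runtime (hypothetical)."""

    def danger_probability(self, snapshot: SensorSnapshot) -> float:
        # Prompt the model to reason over the fused sensor context and
        # return a probability of danger in [0, 1]. Stubbed here; a real
        # implementation would run local inference on-device.
        prompt = (
            "Given the scene description and sounds below, estimate the "
            "probability (0-1) that the user is in danger.\n"
            f"Scene: {snapshot.image_caption}\n"
            f"Sounds: {snapshot.audio_events}\n"
        )
        _ = prompt
        return 0.0


def capture_snapshot() -> SensorSnapshot:
    """Hypothetical capture of the device's camera and microphone streams."""
    return SensorSnapshot(image_caption="empty hallway", audio_events="silence")


def notify_user(probability: float) -> None:
    """Surface a context-aware safety notification locally on the device."""
    print(f"Hazard alert: estimated danger probability {probability:.2f}")


def main() -> None:
    llm = OnDeviceLLM()
    while True:  # continuous, real-time processing of sensor streams
        snapshot = capture_snapshot()
        p = llm.danger_probability(snapshot)
        if p >= DANGER_THRESHOLD:
            notify_user(p)
        time.sleep(1.0)  # assumed polling interval


if __name__ == "__main__":
    main()
```

Because both sensing and inference run locally in this sketch, no raw camera or microphone data leaves the device, consistent with the privacy goal stated above.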
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Labzovsky, Ilia and Karmon, Danny, "On-Device Multi-Modal Hazard Detection Using a Language Model", Technical Disclosure Commons, (September 01, 2025)
https://www.tdcommons.org/dpubs_series/8537