Abstract

Existing hazard detection methods often depend on cloud processing, which can introduce latency and privacy concerns. They are also frequently limited to a single data modality and therefore lack comprehensive contextual understanding. This disclosure describes techniques for on-device, multi-modal hazard detection using a language model or a large language model (LLM). The device continuously processes raw, real-time data streams from multiple sensors, such as cameras and microphones. An on-device LLM analyzes the fused sensor data to reason about the user's environment and calculate a probability of danger. Because all processing occurs locally on the device, the techniques provide timely, context-aware safety notifications while preserving data privacy.
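
The pipeline summarized above (continuous sensor capture, local fusion, on-device LLM scoring, and threshold-based notification) can be sketched as follows. This is a minimal illustrative sketch, not the disclosed implementation: the names SensorHub, OnDeviceLLM, danger_probability, and notify_user, as well as the 0.8 alert threshold, are hypothetical placeholders standing in for a device's actual sensor interfaces and LLM runtime.

```python
# Minimal sketch of an on-device, multi-modal hazard-detection loop.
# All class and function names here are hypothetical placeholders; a real
# system would read from the device's camera/microphone and run a local
# multi-modal LLM, with no data leaving the device.

import time
from dataclasses import dataclass

DANGER_THRESHOLD = 0.8  # assumed cutoff for issuing a safety notification


@dataclass
class SensorFrame:
    """One fused snapshot of raw sensor data."""
    image: bytes      # raw camera frame
    audio: bytes      # raw microphone chunk
    timestamp: float


class SensorHub:
    """Stub standing in for the device's camera and microphone streams."""
    def read(self) -> SensorFrame:
        return SensorFrame(image=b"", audio=b"", timestamp=time.time())


class OnDeviceLLM:
    """Stub for a locally running multi-modal LLM."""
    def danger_probability(self, frame: SensorFrame) -> float:
        # A real model would jointly reason over the fused image and audio
        # context and return a calibrated probability of danger in [0, 1].
        return 0.0


def notify_user(probability: float) -> None:
    """Surface a timely, context-aware safety notification to the user."""
    print(f"Hazard alert: estimated danger probability {probability:.2f}")


def detection_loop(hub: SensorHub, model: OnDeviceLLM) -> None:
    # Continuous, real-time processing: fuse the sensor streams, score the
    # fused data locally, and alert only when the estimate crosses the
    # threshold. All processing stays on the device.
    while True:
        frame = hub.read()
        p = model.danger_probability(frame)
        if p >= DANGER_THRESHOLD:
            notify_user(p)


if __name__ == "__main__":
    detection_loop(SensorHub(), OnDeviceLLM())
```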

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
