Devices that provide virtual assistants that respond to spoken queries monitor the user’s speech to detect a hotword that triggers the virtual assistant application. However, hotword detection requires some components of the device to be always on, which makes such devices power hungry. Always-on microphones are especially problematic for small devices, such as hearables, that have small batteries. This disclosure describes techniques that gate hotword detection with a lightweight neural network that estimates mouth motion signatures from the signals of a low-power, always-on inertial measurement unit (IMU). Because the IMU has lower bandwidth than a microphone, its signals can be processed by a neural network small enough to reside on the front-end processor of a low-power device such as a hearable. Hotword gating can therefore be performed without waking the device’s power-hungry processors. IMU-based gating can enable high-precision hotword detection at very low power consumption.
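The gating idea described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the function names, the threshold, and the weights are hypothetical, and a logistic function over the signal's RMS deviation stands in for the lightweight neural network, whose architecture is not specified here. The sketch only shows the control flow, in which the more expensive hotword detector is invoked only when the IMU-based score suggests mouth motion.

```python
import math
from typing import Callable, Sequence


def mouth_motion_score(imu_window: Sequence[float],
                       weight: float = 4.0,
                       bias: float = -2.0) -> float:
    """Hypothetical stand-in for the lightweight neural network.

    Maps the RMS deviation of one window of single-axis IMU samples to a
    score in (0, 1). The weight and bias are illustrative assumptions.
    """
    if not imu_window:
        return 0.0
    mean = sum(imu_window) / len(imu_window)
    rms = math.sqrt(sum((x - mean) ** 2 for x in imu_window) / len(imu_window))
    return 1.0 / (1.0 + math.exp(-(weight * rms + bias)))


def gated_hotword_detect(imu_window: Sequence[float],
                         run_hotword_detector: Callable[[], bool],
                         threshold: float = 0.5) -> bool:
    """Run the hotword detector only if the IMU gate opens.

    run_hotword_detector is a hypothetical callback representing the
    power-hungry, microphone-based detection path.
    """
    if mouth_motion_score(imu_window) < threshold:
        return False  # Gate closed: the detector is never invoked.
    return run_hotword_detector()
```

With these assumed parameters, a motionless window such as `[0.0] * 50` keeps the gate closed, while a strongly oscillating window such as `[1.0, -1.0] * 25` opens it and hands control to the detector callback.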
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Shin, D, "Low Latency and Energy-efficient Hotword Detection Based on Mouth Movement", Technical Disclosure Commons, (July 07, 2023)