Inventor(s)

Dongeek Shin

Abstract

This publication describes using signal-to-noise ratio (SNR) estimators on audio recordings to identify an appropriate large language model (LLM) for analyzing the transcriptions generated from those recordings. The technique may enable users of microphone-enabled devices (e.g., smartwatches, tablets, other wearables, and mobile phones) to obtain more accurate responses to audio queries, because the SNR estimate serves as a proxy for the accuracy or quality of the transcription. Speech-to-text transcription engines may have more difficulty generating accurate transcriptions from audio that has a low SNR (i.e., a high level of noise relative to the signal). The device may therefore use the SNR estimate to select an appropriate LLM to generate a response to the audio input (e.g., a query, prompt, or command). By selecting an LLM based at least in part on the SNR of the audio input, the computing device may use a more sophisticated, but slower and more power-intensive, LLM to generate the response when the audio is noisy. Conversely, the computing device may use a simpler, faster, and less power-intensive LLM to generate the response when the audio is cleaner.
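
The selection step described above can be illustrated with a minimal Python sketch. It is not the publication's implementation: the function names, the model labels, the 15 dB threshold, and the SNR estimator are all assumptions made for illustration. The estimator compares the mean power of a speech segment with that of a noise-only segment, which is one simple approach among many.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of a block of audio samples."""
    if len(samples) == 0:
        raise ValueError("need at least one sample")
    return sum(v * v for v in samples) / len(samples)


def estimate_snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Crude SNR estimate in dB (hypothetical helper).

    Assumes `speech` holds speech plus background noise and `noise`
    holds background noise only, so the signal power is approximated
    as the difference of the two mean powers.
    """
    noise_power = mean_power(noise)
    signal_power = mean_power(speech) - noise_power
    if signal_power <= 0.0:
        return -math.inf  # speech is no louder than the noise floor
    if noise_power == 0.0:
        return math.inf  # perfectly clean recording
    return 10.0 * math.log10(signal_power / noise_power)


def select_llm(snr_db: float, threshold_db: float = 15.0) -> str:
    """Pick a model tier from the SNR estimate.

    The threshold and the model labels are illustrative assumptions.
    Noisy audio (low SNR) is routed to the larger, slower model;
    cleaner audio is routed to the smaller, faster one.
    """
    if snr_db < threshold_db:
        return "large-robust-llm"
    return "small-fast-llm"
```

In use, a device would compute `estimate_snr_db` on the captured audio, pass the result to `select_llm`, and send the transcription to whichever model is returned. A real system might use more than two tiers or combine the SNR with other signals, which matches the publication's wording that the selection is based "at least in part" on the SNR.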

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
