Voice commands are commonly used to interact with virtual assistant applications on user devices such as smart speakers, appliances, and smartphones. When a user provides permission, some voice-enabled applications upload the user’s speech data to a server using lossless compression so that the server can recognize the user command. Losslessly compressed audio can consume significant network resources, and the server response can take a significant amount of time to arrive when the user has a slow network connection. This disclosure provides techniques that enable faster transmission of user speech data for server-side processing while retaining recognition quality. Allowing loss in the transmitted audio reduces the resources required for speech data transmission. To preserve recognition quality, the user’s environment is evaluated, with user permission, to determine whether lossy transmission is feasible for the particular user speech.
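The selection step can be illustrated with a minimal sketch. The abstract does not say how the user’s environment is evaluated, so everything here is an assumption made for illustration: the signal-to-noise heuristic, the names `frame_energies`, `estimate_snr_db`, and `choose_compression`, the 160-sample frame size, and the 20 dB threshold are all hypothetical. The sketch treats a quiet environment (high estimated SNR) as one where lossy compression is likely acceptable, and falls back to lossless compression when the environment is noisy or when the user has not permitted the evaluation.

```python
import math


def frame_energies(samples, frame_size=160):
    """Mean squared amplitude of each full frame of PCM samples."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def estimate_snr_db(samples, frame_size=160):
    """Rough SNR estimate (assumed heuristic, not from the disclosure):
    mean energy of the loudest tenth of frames over the quietest tenth."""
    energies = sorted(frame_energies(samples, frame_size))
    if not energies:
        return 0.0  # no full frame available; treat as unknown/noisy
    k = max(1, len(energies) // 10)
    noise = sum(energies[:k]) / k
    speech = sum(energies[-k:]) / k
    eps = 1e-12  # avoids division by zero and log of zero
    return 10.0 * math.log10((speech + eps) / (noise + eps))


def choose_compression(samples, user_permits_analysis, snr_threshold_db=20.0):
    """Return "lossy" or "lossless" for this utterance.

    Without user permission the environment is not evaluated and the
    conservative lossless path is used. The threshold is hypothetical.
    """
    if not user_permits_analysis:
        return "lossless"
    if estimate_snr_db(samples) >= snr_threshold_db:
        return "lossy"
    return "lossless"


# Usage: quiet background followed by loud speech, versus uniformly loud audio.
clean = [1] * 1600 + [1000] * 1600
noisy = [500] * 3200
print(choose_compression(clean, user_permits_analysis=True))
print(choose_compression(noisy, user_permits_analysis=True))
```

Defaulting to lossless compression whenever the estimate is low or unavailable keeps recognition quality as the priority, with the bandwidth saving applied only when the audio appears clean.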
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Werthman, Jordan and Shires, Glen, "Automatic selection of audio compression for spoken commands", Technical Disclosure Commons, (May 06, 2019)