Defensive Publications Series

Personalizing Speech Recognition Based on User-entered Text

Swaroop Ramaswamy
Theresa Breiner
Igor Pisarev
Dan Zivkovic
Mingqing Chen
Rajiv Mathews
Lara McConnaughey

Abstract

This disclosure describes techniques that, with user permission, use text entered by a user to improve automatic speech recognition (ASR). A pre-trained language model and a personalization plan are applied to the user-entered text to build a personalized language model. Because personalization draws on user-entered text rather than dictated speech, it improves the dictation experience even for users who seldom use dictation. The personalized language model, trained only on user-permitted data, is combined with an ASR model using shallow fusion. The combination can provide recognition performance superior to that of either component model. Fusion with language models trained through federated learning, as described herein, can improve dictation quality without requiring access to large amounts of transcribed dictation data.
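
As an illustration of the shallow fusion step mentioned in the abstract, the sketch below combines an ASR score with a language model score by log-linear interpolation: fused score = log P_ASR(y | x) + lambda * log P_LM(y). This is a minimal sketch and not the disclosure's implementation. It makes several assumptions: shallow fusion is usually applied at each step of beam search, whereas this example applies the same interpolation to complete n-best hypotheses for brevity; the personalized language model is stood in for by a toy smoothed unigram model; and the names `make_unigram_lm`, `shallow_fusion_rescore`, and `lm_weight` are hypothetical.

```python
import math


def make_unigram_lm(user_text, smoothing=1.0):
    """Build a toy add-k smoothed unigram LM from user-entered text.

    Assumption: this stands in for the personalized language model; a real
    system would use a far richer model.
    """
    tokens = user_text.lower().split()
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    total = len(tokens)
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def log_prob(text):
        # Sum of per-word log probabilities under the smoothed unigram model.
        denominator = total + smoothing * vocab_size
        return sum(
            math.log((counts.get(word, 0) + smoothing) / denominator)
            for word in text.lower().split()
        )

    return log_prob


def shallow_fusion_rescore(hypotheses, lm_log_prob, lm_weight=0.5):
    """Rank (text, asr_log_prob) pairs by the fused score, best first.

    Fused score = asr_log_prob + lm_weight * lm_log_prob(text).
    Assumption: rescoring whole hypotheses simplifies per-step fusion.
    """
    scored = [
        (text, asr_lp + lm_weight * lm_log_prob(text))
        for text, asr_lp in hypotheses
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical user-entered text containing an uncommon name.
    lm = make_unigram_lm("meet anika at the cafe meet anika tomorrow")
    # Hypothetical ASR n-best list with made-up log probabilities.
    n_best = [("meet annika tomorrow", -4.0), ("meet anika tomorrow", -4.2)]
    for text, score in shallow_fusion_rescore(n_best, lm):
        print(f"{score:.3f}  {text}")
```

In this made-up example the ASR model alone slightly prefers the spelling the user never typed, while the fused score favors the spelling that appears in the user's own text. The weight `lm_weight` controls how strongly the language model influences the result; setting it to 0 recovers the ASR-only ranking.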

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Ramaswamy, Swaroop; Breiner, Theresa; Pisarev, Igor; Zivkovic, Dan; Chen, Mingqing; Mathews, Rajiv; and McConnaughey, Lara, "Personalizing Speech Recognition Based on User-entered Text", Technical Disclosure Commons, (February 08, 2022)
https://www.tdcommons.org/dpubs_series/4887
