Abstract
Automatic speech recognition (ASR) machine learning models are deployed on client devices that include speech interfaces. ASR models can benefit from continuous learning and adaptation to large-scale changes, e.g., as new words are added to the vocabulary. While federated learning can enable continuous learning for ASR models in a privacy-preserving manner, the trained model can perform poorly on rarely occurring, long-tail words if the training data distribution is skewed and does not adequately represent such words. This disclosure describes federated learning techniques that improve ASR model quality on long-tail words under an imbalanced data distribution. Two approaches are described herein: probabilistic sampling and client loss weighting. In probabilistic sampling, federated clients whose data include fewer long-tail words are less likely to be selected during training. In client loss weighting, incorrect predictions on long-tail words are penalized more heavily than incorrect predictions on other words.
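The two approaches can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the function names, the per-client long-tail fractions, and the fixed penalty multiplier are all assumptions made for the example.

```python
import random

def sampling_probabilities(long_tail_fractions):
    """Probabilistic sampling: assign each federated client a selection
    probability proportional to the fraction of its data containing
    long-tail words, so clients with fewer long-tail words are less
    likely to be chosen for a training round."""
    total = sum(long_tail_fractions)
    return [f / total for f in long_tail_fractions]

def sample_clients(long_tail_fractions, k, rng=random):
    """Draw k clients for a round using the skew-aware probabilities."""
    probs = sampling_probabilities(long_tail_fractions)
    return rng.choices(range(len(probs)), weights=probs, k=k)

def weighted_loss(per_word_losses, is_long_tail, tail_weight=5.0):
    """Client loss weighting: penalize errors on long-tail words more
    heavily than errors on other words. The multiplier tail_weight is
    a hypothetical choice for illustration."""
    return sum(
        loss * (tail_weight if tail else 1.0)
        for loss, tail in zip(per_word_losses, is_long_tail)
    )
```

For example, with three clients whose long-tail fractions are 0.1, 0.3, and 0.6, the third client is six times more likely to be selected than the first, and a unit loss on a long-tail word contributes five times as much to the client's total loss as a unit loss on a common word.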
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Ding, Yuxin; Xiao, Yonghui; Mathews, Rajiv; Chen, Mingqing; and Zhou, Lillian, "Improved Federated Learning for Handling Long-tail Words", Technical Disclosure Commons (September 08, 2023).
https://www.tdcommons.org/dpubs_series/6235