Techniques are provided for real-time speech recognition using a semi-supervised model based on wav2vec 2.0. Only minimal training data is required, which makes it possible to serve under-represented, low-resource languages at a quality level comparable to that of more widely available languages.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Le Groux, Sylvain and Huang, Zili, "PRODUCTION-GRADE ONLINE SPEECH RECOGNITION FOR LOW-RESOURCE LANGUAGES", Technical Disclosure Commons, (November 14, 2022)