Abstract

Audio transcription systems typically require that the language being spoken be specified explicitly so that a language-specific transcription technique can be employed. In cases where the spoken language is not explicitly indicated, a possible approach is to process the audio via all available transcription techniques and choose the transcription associated with the highest confidence. Such an approach does not scale to a large number of natural languages and is computationally expensive. This disclosure describes automatic identification of the natural language being spoken in audio input. The audio is processed using a trained machine learning model that outputs a language code corresponding to the language being spoken. Such a two-step approach, with a language identification step preceding the transcription step, enables support for a large number of natural languages without incurring the computational cost, latency, and inaccuracy of employing multiple transcription techniques in parallel.
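
The two-step flow described above can be illustrated with a minimal sketch. The function names, language codes, and transcriber registry below are hypothetical placeholders standing in for whatever language-identification model and per-language transcription techniques a given system actually uses; they are not part of the disclosure.

```python
from typing import Callable, Dict

# Hypothetical registry mapping language codes to language-specific
# transcription functions (one per supported natural language).
TRANSCRIBERS: Dict[str, Callable[[bytes], str]] = {
    "en-US": lambda audio: "<English transcription>",
    "hi-IN": lambda audio: "<Hindi transcription>",
    # ... one entry per supported language
}


def identify_language(audio: bytes) -> str:
    """Step 1: run a trained language-identification model on the audio
    and return a language code. Stubbed here for illustration."""
    return "en-US"  # placeholder prediction


def transcribe(audio: bytes) -> str:
    """Step 2: route the audio to the single transcriber matching the
    identified language, instead of running all transcribers in parallel."""
    language_code = identify_language(audio)
    transcriber = TRANSCRIBERS.get(language_code)
    if transcriber is None:
        raise ValueError(f"No transcriber registered for {language_code}")
    return transcriber(audio)


if __name__ == "__main__":
    print(transcribe(b"\x00\x01"))  # dummy audio bytes
```

Because only one language-specific transcriber runs per input, the cost of adding support for another language is limited to registering its transcriber and extending the language-identification model, rather than adding another parallel transcription pass.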

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
