Abstract

A machine-learning model that automatically converts the audio stream of audio-visual content from a source language to a destination language is described. In response to determining that an audio stream should be translated, a machine-learning-based dubbing model is invoked for a specific destination language. When multiple speakers are present, voice embedding techniques are used to match each dubbed audio stream to the corresponding speaker. The sentiment in the original speaker's voice is preserved by training the model with a targeted data set in the destination language.
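The speaker-matching step described above can be sketched as a nearest-neighbor assignment over voice embeddings. The sketch below is a minimal illustration, not the disclosed implementation: it assumes each original speaker and each dubbed segment has already been mapped to a fixed-length voice embedding by some upstream model, and assigns each dubbed segment to the original speaker with the highest cosine similarity. The function names and the use of cosine similarity are assumptions for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_speakers(dubbed_embeddings, speaker_embeddings):
    """Assign each dubbed segment to the original speaker whose voice
    embedding is most similar (hypothetical nearest-neighbor matching).

    dubbed_embeddings: list of vectors, one per dubbed audio segment.
    speaker_embeddings: list of vectors, one per original speaker.
    Returns a list of speaker indices, one per dubbed segment.
    """
    assignments = []
    for d in dubbed_embeddings:
        scores = [cosine_similarity(d, s) for s in speaker_embeddings]
        assignments.append(int(np.argmax(scores)))
    return assignments
```

In practice the embeddings would come from a speaker-verification or voice-encoder model; the matching itself reduces to the simple similarity search shown here.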

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
