Defensive Publications Series

A Type-to-Talk Framework Using Generative Voice Cloning Methods for Privacy-Preserving Communications

Abstract

A method is disclosed for providing a text-to-speech framework that generates speech mimicking a user’s voice. The method receives sample speech from the user and uses a neural network to generate speaker embeddings specific to that user. The speaker embeddings are used to fine-tune a generative vocoder, which can then generate speech that mimics the user’s speech patterns and vocal characteristics. Thus, text entered by the user can be converted to audio that sounds like the user’s speech. The generated audio is then transmitted to other participants in a virtual meeting.
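
The data flow described in the abstract (enrollment speech, speaker embedding, fine-tuned vocoder, typed text to transmitted audio) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than the disclosed implementation: `embed_speaker` replaces the neural speaker encoder with simple amplitude averaging, `ToyVocoder` emits sine tones instead of speech, and `send` represents whatever transport the virtual meeting uses. The names, parameters, and audio representation are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Audio = List[float]  # mono samples in [-1.0, 1.0]; a stand-in for real audio


@dataclass(frozen=True)
class SpeakerEmbedding:
    vector: Tuple[float, ...]


def embed_speaker(samples: Sequence[Audio], dim: int = 4) -> SpeakerEmbedding:
    """Placeholder for the neural speaker encoder (not a real model)."""
    if not samples:
        raise ValueError("at least one speech sample is required")
    totals = [0.0] * dim
    for clip in samples:
        chunk = max(1, len(clip) // dim)
        for i in range(dim):
            part = clip[i * chunk:(i + 1) * chunk]
            if part:
                # Mean absolute amplitude of this chunk of the clip.
                totals[i] += sum(abs(x) for x in part) / len(part)
    # Average the per-chunk statistics across all enrollment clips.
    return SpeakerEmbedding(tuple(t / len(samples) for t in totals))


class ToyVocoder:
    """Placeholder for the generative vocoder; emits a sine tone per character."""

    def __init__(self, speaker: Optional[SpeakerEmbedding] = None) -> None:
        self.speaker = speaker

    def fine_tune(self, speaker: SpeakerEmbedding) -> "ToyVocoder":
        # A real system would update model weights; this only stores the embedding.
        return ToyVocoder(speaker)

    def synthesize(self, text: str, samples_per_char: int = 8) -> Audio:
        if self.speaker is None:
            raise RuntimeError("vocoder must be fine-tuned on a speaker first")
        # The embedding only scales loudness here, as a visible speaker dependence.
        gain = min(1.0, sum(self.speaker.vector) / len(self.speaker.vector))
        audio: Audio = []
        for ch in text:
            freq = 0.05 + (ord(ch) % 32) / 100.0  # arbitrary per-character pitch
            audio.extend(gain * math.sin(2 * math.pi * freq * n)
                         for n in range(samples_per_char))
        return audio


def type_to_talk(text: str, vocoder: ToyVocoder,
                 send: Callable[[Audio], None]) -> Audio:
    """Convert typed text to audio and hand it to the meeting transport."""
    audio = vocoder.synthesize(text)
    send(audio)
    return audio


# Usage: enroll once from sample speech, then speak typed text.
enrollment = [[0.2, -0.4, 0.6, -0.8], [0.1, -0.3, 0.5, -0.7]]
voice = ToyVocoder().fine_tune(embed_speaker(enrollment))
outbox: List[Audio] = []  # stands in for the other meeting participants
audio = type_to_talk("hi", voice, outbox.append)
```

The sketch separates enrollment (embedding and fine-tuning, done once per user) from the per-message path (synthesis and transmission), which mirrors the order of steps in the abstract. In a real system the placeholders would be replaced by trained models.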

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Shin, Dongeek, "A Type-to-Talk Framework Using Generative Voice Cloning Methods for Privacy-Preserving Communications", Technical Disclosure Commons, (July 31, 2023)
https://www.tdcommons.org/dpubs_series/6091
