Abstract

Voice communication can be difficult for people with impaired or accented speech. When such users communicate with others through applications on their devices, listeners often struggle to understand them. This disclosure describes techniques that, with the user's permission, dynamically process impaired or accented speech and convert it to synthesized canonical speech. The synthesized speech is generated with low latency as the user speaks, enabling the parties to engage in smooth communication that is unaffected by the speaker's speech impairment. Listeners receive clear, fluent speech automatically generated by suitably trained models. In addition, users can personalize the operation based on their specific speech impairments. The techniques can be integrated within any messaging, conferencing, or phone calling/dialer application on any device, making those applications more accessible to users with impaired speech and enhancing the user experience.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
