This publication describes techniques for reading text aloud and enhancing text content to produce an audio experience. A user identifies text to be read (e.g., news articles, email threads, social-media feeds, stories) on their computing device and saves it to a Text Manager. The Text Manager may use a machine-learned model implemented on the computing device to produce high-quality text-to-speech output that is, for example, conversational and well paced, with correct pauses and a tone appropriate to the context. The machine-learned model may also infer the context of the text in order to select, for example, relevant sounds or music that may enhance the speech. The user can queue multiple pieces of content and listen at their leisure.
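
As a rough illustration of the save-and-queue flow described above, the following minimal Python sketch models a Text Manager as a first-in, first-out queue. Everything in it is an assumption made for illustration: the names `TextManager`, `AudioPlan`, `infer_background`, and `CONTEXT_SOUNDS` do not come from the publication, and the keyword lookup is only a stand-in for the machine-learned context model. No speech is synthesized here; the sketch stops at deciding what a player would render.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioPlan:
    """What a hypothetical player would render for one saved item."""
    title: str
    text: str
    background: str  # name of a background sound or music bed, or "none"


# Hypothetical keyword table standing in for the machine-learned context model.
CONTEXT_SOUNDS = {
    "storm": "rain_ambience",
    "ocean": "waves_ambience",
    "election": "news_sting",
}


def infer_background(text: str) -> str:
    """Pick a background sound by keyword match (a placeholder for the ML model)."""
    lowered = text.lower()
    for keyword, sound in CONTEXT_SOUNDS.items():
        if keyword in lowered:
            return sound
    return "none"


class TextManager:
    """Holds saved text in first-in, first-out order until the user listens."""

    def __init__(self) -> None:
        self._queue = deque()  # items are (title, text) tuples

    def save(self, title: str, text: str) -> None:
        """Add a piece of content to the end of the listening queue."""
        self._queue.append((title, text))

    def pending(self) -> int:
        """Return how many items are waiting to be played."""
        return len(self._queue)

    def next_item(self) -> Optional[AudioPlan]:
        """Remove the oldest item and return its plan, or None if the queue is empty."""
        if not self._queue:
            return None
        title, text = self._queue.popleft()
        return AudioPlan(title=title, text=text, background=infer_background(text))
```

A caller might save several items with `save(...)` and later call `next_item()` repeatedly until it returns `None`. In a fuller system, each returned plan would be handed to an on-device text-to-speech engine, which this sketch deliberately leaves out.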

This work is licensed under a Creative Commons Attribution 4.0 License.