Traditional audio devices such as over-the-ear or in-ear headphones are limited in their ability to provide a personalized and engaging listening experience. In a noisy environment, a user of such devices can find it difficult to focus on the audio that matters most. This disclosure describes the use of mixed reality audio to enhance the listening experience via headphones, earbuds, or other audio devices during normal use. Machine learning techniques modify the audio according to user preferences, e.g., to focus on a single person talking while the user is in a group, to turn down competing audio such as other people talking in a crowd or at a party, or to replace muffled words with clean versions. The modified audio is played back via the headphones, earbuds, or other audio device to provide a personalized listening experience.
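
As a rough illustration of the preference-driven modification described above, the following minimal sketch remixes audio that is assumed to have already been separated into per-source tracks. The separation step itself (the machine learning part) is not shown and is not specified here; the function name `remix`, the source labels, and the gain values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def remix(
    sources: Dict[str, List[float]],
    gains: Dict[str, float],
    default_gain: float = 1.0,
) -> List[float]:
    """Mix separated source tracks using per-source user preference gains.

    Assumption: `sources` maps a hypothetical source label (e.g. "crowd")
    to mono samples in the range [-1.0, 1.0], as some upstream separation
    model might produce. `gains` maps the same labels to linear gains;
    labels missing from `gains` fall back to `default_gain`.
    """
    if not sources:
        return []

    lengths = {len(track) for track in sources.values()}
    if len(lengths) != 1:
        raise ValueError("all source tracks must have the same length")

    mixed = [0.0] * lengths.pop()
    for label, track in sources.items():
        gain = gains.get(label, default_gain)
        for i, sample in enumerate(track):
            mixed[i] += gain * sample

    # Hard-clip so the remixed signal stays within the valid sample range.
    return [max(-1.0, min(1.0, s)) for s in mixed]


if __name__ == "__main__":
    # Hypothetical example: keep one talker, halve the background crowd.
    separated = {
        "target_speaker": [0.5, -0.5, 0.25],
        "crowd": [0.5, 0.5, -0.5],
    }
    preferences = {"target_speaker": 1.0, "crowd": 0.5}
    print(remix(separated, preferences))
```

Setting a source's gain to 0.0 removes it entirely, while a value between 0.0 and 1.0 turns it down without eliminating it; a practical system would likely smooth gain changes over time and use a gentler limiter than hard clipping.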

