Operating systems (OS) typically handle multiple simultaneous audio sources through media session and audio focus mechanisms. These mechanisms generally permit only a single audio source to output sound at any given time. Switching to another component that generates sound therefore replaces the currently playing audio with the audio of the new component, which may not always provide an optimal user experience. Moreover, the media session and audio focus mechanisms of different operating systems, e.g., a guest OS running as a container or virtual machine atop a host OS, may not interoperate seamlessly, making it difficult to provide a unified media experience. This disclosure describes mechanisms to handle multiple audio sources that request sound output at the same time. A single media session and audio focus service handles audio output requests from all audio sources. The service aggregates the various media sessions and applies rules to determine which of the simultaneous audio playback requests are played at any given time.
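
The unified service described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the disclosure: the usage categories and their priorities, the `mixable` flag, the `AudioRequest` and `UnifiedAudioFocusService` names, and the rule itself (the highest-priority request plays, ties go to the most recent request, and lower-ranked requests play only if marked mixable). The disclosure does not specify which rules the service applies.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical usage categories and priorities (not specified by the disclosure).
PRIORITY: Dict[str, int] = {"call": 3, "navigation": 2, "media": 1, "notification": 0}


@dataclass(frozen=True)
class AudioRequest:
    source_id: str         # e.g. "host:music" or "guest:video" (illustrative IDs)
    os_name: str           # "host" or "guest"; both feed the same service
    usage: str             # must be a key of PRIORITY
    mixable: bool = False  # assumption: source may play alongside the winner


class UnifiedAudioFocusService:
    """Aggregates audio requests from all sources and decides which ones play."""

    def __init__(self) -> None:
        self._active: Dict[str, AudioRequest] = {}
        self._seq: Dict[str, int] = {}  # request order, used to break priority ties
        self._counter = 0

    def request_focus(self, req: AudioRequest) -> List[str]:
        """Register a request and return the IDs of sources allowed to play."""
        self._counter += 1
        self._active[req.source_id] = req
        self._seq[req.source_id] = self._counter
        return self.playing()

    def abandon_focus(self, source_id: str) -> List[str]:
        """Drop a source's request and return the IDs of sources allowed to play."""
        self._active.pop(source_id, None)
        self._seq.pop(source_id, None)
        return self.playing()

    def playing(self) -> List[str]:
        """Apply the illustrative rule: one winner plus any mixable sources."""
        if not self._active:
            return []
        winner = max(
            self._active.values(),
            key=lambda r: (PRIORITY[r.usage], self._seq[r.source_id]),
        )
        mixed = sorted(
            r.source_id
            for r in self._active.values()
            if r.mixable and r.source_id != winner.source_id
        )
        return [winner.source_id] + mixed
```

Because requests from the host OS and the guest OS go through the same object, the decision does not depend on which OS a source runs in. A different policy could be substituted by changing only `playing()`.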

