Defensive Publications Series

Contextual Audio Ducking Using Real-Time Stem Separation and Spatial Manipulation

Abstract

The standard approach to overlaying text-to-speech (TTS) on background audio is to attenuate the whole background track and play the TTS over it. Because the attenuation is applied uniformly, it degrades the user experience by stripping the rhythm from the background audio along with everything else. This disclosure describes an audio-mixing architecture that combines real-time audio source separation with dynamic spatial manipulation. Instead of attenuating the master volume, the incoming stereo audio is separated into distinct stems (e.g., vocals, drums). When a system audio event (such as a voice assistant speaking) is triggered, a mixing matrix is applied as follows. The vocal stem is ducked to reduce frequency masking of the inserted voice. The bass stem is boosted to maintain the rhythmic energy of the track. The stereo width of the remaining instrumental stems is increased. This pushes the background music to the edges of the spatial soundscape, carving out a clear acoustic pocket in the center for the inserted voice. When the event ends, the mix transitions smoothly back to the original.
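The mixing matrix described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the stems have already been produced by some source-separation step, and the stem names (`vocals`, `bass`), the gain and width constants, and the `amount` crossfade parameter are all hypothetical, since the disclosure gives no concrete values. Widening is done here with a simple mid/side scaling, which is one common way to increase stereo width and is an assumption about how it might be done.

```python
from typing import Dict, List, Tuple

Stereo = Tuple[List[float], List[float]]  # (left, right) sample lists

# Hypothetical targets at full ducking; the disclosure specifies no values.
VOCAL_GAIN_DB = -12.0  # attenuate the vocal stem
BASS_GAIN_DB = 3.0     # boost the bass stem
WIDTH = 1.5            # side-channel scale for remaining stems (1.0 = unchanged)


def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t=0) to b (t=1)."""
    return a + (b - a) * t


def apply_gain(stem: Stereo, gain: float) -> Stereo:
    """Scale both channels of a stem by the same linear gain."""
    left, right = stem
    return [s * gain for s in left], [s * gain for s in right]


def widen(stem: Stereo, width: float) -> Stereo:
    """Scale the side (L-R) component, leaving the mid (L+R) component as is."""
    left, right = stem
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r)
        out_l.append(mid + width * side)
        out_r.append(mid - width * side)
    return out_l, out_r


def duck_mix(stems: Dict[str, Stereo], amount: float) -> Stereo:
    """Mix stems; amount=0 gives the original mix, amount=1 the full ducked mix."""
    amount = min(1.0, max(0.0, amount))
    vocal_gain = lerp(1.0, db_to_linear(VOCAL_GAIN_DB), amount)
    bass_gain = lerp(1.0, db_to_linear(BASS_GAIN_DB), amount)
    width = lerp(1.0, WIDTH, amount)

    n = len(next(iter(stems.values()))[0])
    mix_l, mix_r = [0.0] * n, [0.0] * n
    for name, stem in stems.items():
        if name == "vocals":
            left, right = apply_gain(stem, vocal_gain)
        elif name == "bass":
            left, right = apply_gain(stem, bass_gain)
        else:
            left, right = widen(stem, width)
        for i in range(n):
            mix_l[i] += left[i]
            mix_r[i] += right[i]
    return mix_l, mix_r
```

Ramping `amount` from 0 to 1 when the voice event starts, and back to 0 when it ends, gives one simple way to approximate the smooth transition into and out of the ducked mix. Note that in this sketch any stem content that is identical in both channels has no side component, so widening leaves it untouched.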

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Ling, Gaetano; Rickerby, Joe; Pawle, Benjamin; Colville, Michael; Bertran, Ishac; and Lee, DK, "Contextual Audio Ducking Using Real-Time Stem Separation and Spatial Manipulation", Technical Disclosure Commons, (June 24, 2026)
https://www.tdcommons.org/dpubs_series/10561
