Abstract
A system is described for compositing a user's image into media streams protected by digital rights management (DRM). The system may operate via a multi-stage pipeline, which can begin with context extraction on a user device, such as a smartphone or smart television. Computer vision models may analyze a media stream to create a context packet detailing information such as scene lighting, spatial depth, and artistic style. This packet can then be used to condition a generative artificial intelligence (AI) model, which may synthesize the user's likeness into the scene with corresponding visual characteristics. A resulting composite can be rendered in a sandboxed environment as an ephemeral overlay to assist in protecting the integrity of the source DRM-protected content. This technique may facilitate the creation of personalized visual insertions into streaming media, potentially enabling a more interactive viewing experience.
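The three-stage pipeline in the abstract could be sketched as follows. This is a minimal illustrative mock, not the disclosed implementation: every name here (`ContextPacket`, `extract_context`, `synthesize_composite`, `render_overlay`) is hypothetical, the "models" are reduced to summary statistics, and the generative step is stubbed out.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class ContextPacket:
    """Hypothetical scene attributes extracted from the media stream."""
    lighting: str      # e.g. "dim" or "bright"
    mean_depth: float  # stand-in for a spatial-depth estimate
    style: str         # stand-in for an artistic-style label

def extract_context(frame_luma: list[float], depth_map: list[float],
                    style: str) -> ContextPacket:
    """Stage 1: analyze the stream on-device and build a context packet.

    A real system would run computer-vision models; here we reduce a
    per-pixel luma list and depth map to toy summary statistics.
    """
    lighting = "bright" if mean(frame_luma) >= 0.5 else "dim"
    return ContextPacket(lighting, mean(depth_map), style)

def synthesize_composite(user_image_id: str, ctx: ContextPacket) -> dict:
    """Stage 2: condition a (mocked) generative model on the packet so the
    synthesized likeness matches scene lighting, depth, and style."""
    return {
        "subject": user_image_id,
        "applied_lighting": ctx.lighting,
        "placement_depth": ctx.mean_depth,
        "style": ctx.style,
    }

def render_overlay(composite: dict) -> dict:
    """Stage 3: wrap the composite as an ephemeral sandboxed overlay,
    leaving the DRM-protected source stream itself untouched."""
    return {"layer": "overlay", "ephemeral": True, "content": composite}

overlay = render_overlay(
    synthesize_composite(
        "user-42",
        extract_context([0.2, 0.3, 0.1], [1.5, 2.5], style="noir"),
    )
)
print(overlay["ephemeral"], overlay["content"]["applied_lighting"])
# → True dim
```

The key design point the sketch preserves is that the composite exists only as a separate ephemeral layer: the pipeline reads scene attributes from the stream but never writes into or re-encodes the DRM-protected frames.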
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Daftari, Dev and Verma, Akash, "System for Real-Time User Compositing into Media Streams Using Generative AI Conditioned on Extracted Scene Context", Technical Disclosure Commons, (April 24, 2026)
https://www.tdcommons.org/dpubs_series/9917