Abstract
Analyzing large-scale media datasets, for purposes such as training artificial intelligence models, can be inefficient when it relies on separate, post hoc analysis pipelines that redundantly decode and process the same content. A framework may instead extract scene-level signals from media content concurrently with media transcoding operations. By sharing a common decoding step between transcoding and analysis, a system can partition media into scenes and compute quantitative signals related to, for example, visual quality, motion dynamics, and production attributes. The extracted signals can be stored in a structured, indexed database, turning a media archive into a queryable dataset. This approach may provide a scalable and computationally efficient method for curating datasets used to pre-filter media, condition generative models, and evaluate model performance across specific content characteristics.
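The pipeline the abstract describes can be illustrated with a minimal sketch. The sketch below is not the disclosed implementation; it assumes hypothetical stand-ins (`decode_frames` simulating a decoder that yields per-frame mean luma, `transcode` as a placeholder transcoder sink, and a luma-difference threshold as a toy scene-cut detector) to show the key idea: each decoded frame is consumed once by both the transcoder and the analyzers, and the resulting scene-level signals land in an indexed SQLite table that can be queried afterward.

```python
import sqlite3

def decode_frames():
    """Stand-in for a shared decoder: yields (timestamp, mean_luma) pairs.
    The jump in luma at t=3 simulates a scene cut. (Hypothetical data.)"""
    lumas = [0.20, 0.22, 0.21, 0.80, 0.79, 0.81]
    for t, luma in enumerate(lumas):
        yield t, luma

def transcode(frame):
    """Placeholder transcoder sink consuming the same decoded frame,
    so decoding happens only once for both paths."""
    pass

def analyze(db_path, cut_threshold=0.3):
    """Partition the stream into scenes and store per-scene signals
    in an indexed, queryable table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS scenes "
                 "(start REAL, end_ REAL, mean_luma REAL)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_luma ON scenes(mean_luma)")
    prev_luma, scene_start, scene_lumas = None, 0, []
    t = 0
    for t, luma in decode_frames():
        transcode((t, luma))  # transcoding shares the decoded frame
        # Toy scene-cut detector: a large luma change starts a new scene.
        if prev_luma is not None and abs(luma - prev_luma) > cut_threshold:
            conn.execute("INSERT INTO scenes VALUES (?, ?, ?)",
                         (scene_start, t, sum(scene_lumas) / len(scene_lumas)))
            scene_start, scene_lumas = t, []
        scene_lumas.append(luma)
        prev_luma = luma
    if scene_lumas:  # flush the final scene
        conn.execute("INSERT INTO scenes VALUES (?, ?, ?)",
                     (scene_start, t + 1, sum(scene_lumas) / len(scene_lumas)))
    conn.commit()
    return conn

conn = analyze(":memory:")
scenes = conn.execute("SELECT start, end_ FROM scenes ORDER BY start").fetchall()
```

A downstream curation query against the index might then select, say, bright scenes only (`SELECT * FROM scenes WHERE mean_luma > 0.5`), which is the "queryable dataset" role the abstract ascribes to the stored signals.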
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Verlani, Pooja and Adsumilli, Balu, "System for Scene-Level Signal Extraction Concurrent With Media Transcoding", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/9610