Abstract
This disclosure describes a system for measuring entity perception from user-generated video content using multimodal large language models. The system implements a multi-stage data ingestion pipeline that uses Knowledge Graph ID filtering to identify relevant videos, processes the videos through a multimodal language model to generate attribute-specific perception scores with rationales, aggregates the scores across temporal intervals together with coverage metrics, and generates comparative visualizations with automated statistical analysis.
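As a rough illustration of the aggregation stage mentioned in the abstract, the following minimal Python sketch groups per-video scores by temporal interval and attribute, then reports a mean score and a coverage value. Everything here is an assumption made for illustration rather than the disclosed implementation: the `VideoScore` record, the `aggregate` function, the month-style interval labels, and in particular the definition of coverage as the fraction of ingested videos in an interval for which the model returned a score.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class VideoScore:
    """Hypothetical record: one model output for one video and one attribute."""
    video_id: str
    interval: str            # assumed interval label, e.g. "2026-01"
    attribute: str           # assumed attribute name, e.g. "quality"
    score: Optional[float]   # None when the model returned no score


def aggregate(scores):
    """Group scores by (interval, attribute); return mean score and coverage.

    Coverage is assumed here to be the share of videos in the group
    that received a score; the disclosure may define it differently.
    """
    buckets = defaultdict(list)
    for s in scores:
        buckets[(s.interval, s.attribute)].append(s.score)

    result = {}
    for key, values in buckets.items():
        scored = [v for v in values if v is not None]
        result[key] = {
            "mean_score": mean(scored) if scored else None,
            "coverage": len(scored) / len(values),
        }
    return result


if __name__ == "__main__":
    sample = [
        VideoScore("a", "2026-01", "quality", 4.0),
        VideoScore("b", "2026-01", "quality", 2.0),
        VideoScore("c", "2026-01", "quality", None),
    ]
    print(aggregate(sample))
```

Reporting coverage next to the mean lets a reader judge how much of the ingested video set a given interval's score actually reflects.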
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Raposo, Angel and Rathi, Prakhar, "Measuring Entity Perception From User-Generated Video Content Using Multimodal Language Models", Technical Disclosure Commons, (February 10, 2026)
https://www.tdcommons.org/dpubs_series/9309