Abstract
This disclosure describes a system and method for controlling linguistic model training using multimodal emotional confidence gating. The proposed system receives audiovisual media comprising spoken dialogue and associated visual, acoustic, and contextual signals, and automatically segments the media into scene portions based on detected emotional stability characteristics. A multimodal emotion analysis module generates an emotional confidence score for each scene portion by fusing facial expression features, vocal prosody parameters, temporal emotional consistency indicators, and contextual correlation data. A training control module compares the emotional confidence score against a configurable threshold and selectively authorizes or suppresses linguistic parameter updates of a language learning model based on the comparison result. Linguistic learning operations, including token extraction, phrase association, and informal language pattern acquisition, are performed only for scene portions satisfying the emotional confidence condition, while the remaining portions are excluded from training. Learned linguistic representations are persistently associated with corresponding emotional state vectors for subsequent emotion-aware inference. By conditioning language acquisition on emotionally reliable media segments, the disclosed approach reduces semantic noise, improves learning efficiency, and enhances the model’s ability to understand and generate emotionally nuanced and informal spoken language without reliance on predefined textual annotations.
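
The confidence-gated training step can be illustrated with a minimal Python sketch. The sketch is not part of the disclosure: the SceneSegment fields, the fuse_confidence fusion weights, the 0.75 threshold, and gated_training_pass are illustrative names and values assumed here only to make the gating logic concrete, and the recorded token list stands in for actual token extraction, phrase association, and parameter updates.

# Minimal, illustrative sketch of emotion-confidence gating (assumed names and values).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SceneSegment:
    """One automatically segmented scene portion with per-modality emotion cues in [0, 1]."""
    face_score: float      # facial expression confidence
    prosody_score: float   # vocal prosody confidence
    temporal_score: float  # temporal emotional consistency
    context_score: float   # contextual correlation
    tokens: List[str]      # spoken-dialogue tokens extracted for this segment

def fuse_confidence(seg: SceneSegment,
                    weights: Tuple[float, float, float, float] = (0.35, 0.30, 0.20, 0.15)) -> float:
    """Fuse the four modality cues into a single emotional confidence score (weights are assumed)."""
    cues = (seg.face_score, seg.prosody_score, seg.temporal_score, seg.context_score)
    return sum(w * c for w, c in zip(weights, cues))

def gated_training_pass(segments: List[SceneSegment],
                        threshold: float = 0.75) -> List[Tuple[List[str], Tuple[float, ...]]]:
    """Authorize linguistic learning only for segments whose confidence meets the threshold."""
    learned = []
    for seg in segments:
        if fuse_confidence(seg) >= threshold:
            # Stand-in for token extraction / phrase association / parameter update:
            # keep the tokens together with their emotional state vector so the learned
            # representation stays tied to the emotion under which it was observed.
            emotion_vector = (seg.face_score, seg.prosody_score,
                              seg.temporal_score, seg.context_score)
            learned.append((seg.tokens, emotion_vector))
        # Segments below the threshold are simply excluded from training.
    return learned

if __name__ == "__main__":
    demo = [
        SceneSegment(0.9, 0.8, 0.85, 0.7, ["no", "way,", "that's", "wild"]),
        SceneSegment(0.3, 0.4, 0.5, 0.2, ["uh,", "maybe,", "I", "guess"]),
    ]
    print(gated_training_pass(demo))  # only the first, emotionally stable segment is learned

Pairing each learned token list with its emotional state vector mirrors the persistent association described above, so that subsequent emotion-aware inference can condition on the emotional context in which a phrase was acquired.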
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Niranjan, Chiranthan, "System and Method for Emotion-Gated Linguistic Model Training Using Multimodal Emotional Confidence Evaluation", Technical Disclosure Commons, (January 12, 2026)
https://www.tdcommons.org/dpubs_series/9176