Abstract
Current techniques for detecting conversational failures in interactions between users and conversational artificial intelligence (AI) agents have several limitations: they capture only a tiny fraction of failures, lack context about the user's internal state during the conversation, and do not scale. This disclosure describes techniques to automatically detect, classify, and log conversational failures in interactions between users and conversational AI agents, generating a rich, structured dataset. With user permission, conversations are analyzed to detect prosodic features and acoustic cues in context and thereby determine likely occurrences of conversational failures. The conversational state is tracked using a state vector that encodes the nature of the failure. Upon failure detection, a structured data entry containing information such as conversational context, agent action, user feedback, inferred state vector, and type of failure is automatically generated and added to a corpus. This automatically generated corpus can be used to improve conversational AI agents by creating challenging evaluation sets, fine-tuning components, and providing training data.
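As an illustration only, the structured data entry described above might be sketched as follows. The abstract names the fields but not their representation, so everything here is an assumption: the class and function names, the failure labels, the threshold, and in particular the reading of the state vector as one score per failure type.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class FailureEntry:
    """One corpus record; field names mirror the abstract, types are assumed."""
    context: List[str]         # preceding conversation turns
    agent_action: str          # what the agent said or did
    user_feedback: str         # the user's reaction to the agent action
    state_vector: List[float]  # inferred conversational state
    failure_type: str          # label for the kind of failure


def maybe_log_failure(
    corpus: List[FailureEntry],
    context: List[str],
    agent_action: str,
    user_feedback: str,
    state_vector: List[float],
    failure_labels: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not from the disclosure
) -> Optional[FailureEntry]:
    """Append an entry when the strongest state component reaches threshold.

    Assumption: component i of state_vector scores failure_labels[i].
    """
    if len(state_vector) != len(failure_labels):
        raise ValueError("state_vector and failure_labels must align")
    top = max(range(len(state_vector)), key=state_vector.__getitem__)
    if state_vector[top] < threshold:
        return None  # no likely failure, nothing is logged
    entry = FailureEntry(
        context=list(context),
        agent_action=agent_action,
        user_feedback=user_feedback,
        state_vector=list(state_vector),
        failure_type=failure_labels[top],
    )
    corpus.append(entry)
    return entry


if __name__ == "__main__":
    labels = ["misrecognition", "interruption", "wrong_action"]  # hypothetical
    corpus: List[FailureEntry] = []
    maybe_log_failure(
        corpus,
        context=["User: play some jazz"],
        agent_action="Agent: Here is the latest jazz news.",
        user_feedback="User: no, not that",
        state_vector=[0.1, 0.2, 0.9],
        failure_labels=labels,
    )
    print(json.dumps([asdict(e) for e in corpus], indent=2))
```

Serializing each entry with `asdict` keeps the corpus easy to export as JSON for evaluation sets or training data. A real system would derive the state vector from the prosodic and acoustic analysis rather than receive it as an argument.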
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Labzovsky, Ilia and Karmon, Danny, "Automatic Generation of a Labeled Corpus of Conversational Failures During Interaction Between a User and a Conversational AI Agent", Technical Disclosure Commons, (September 26, 2025)
https://www.tdcommons.org/dpubs_series/8643