Abstract

Evaluating complex conversational agents is challenging: static benchmarks and manual testing often lack the scale and behavioral diversity needed to detect certain failure modes, and agentic workloads add internal non-determinism and multi-step complexity. A system is described for the automated generation of synthetic, persona-driven dialogues. A multi-agentic process can generate detailed user personas, which can then condition a large language model (LLM) to synthesize multi-turn conversations. A search algorithm can explore conversational paths, and a separate LLM acting as a judge can evaluate the dialogues for realism and persona adherence. This process can produce a large-scale, curated corpus of diverse synthetic test data. The corpus can serve as a testbed to quantitatively assess the performance and stability of conversational agents, potentially facilitating the identification and measurement of complex failure modes.
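
As context, the sketch below illustrates one way such a pipeline could be wired together: persona generation, persona-conditioned dialogue synthesis against an agent under test, and LLM-as-judge curation. All function names, prompts, and thresholds are hypothetical assumptions rather than the disclosed implementation, and the search over conversational paths is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative sketch only: the disclosure names no APIs. `llm` and `agent`
# stand in for any text-in/text-out model call and the agent under test.
LLM = Callable[[str], str]

@dataclass
class Dialogue:
    persona: str                                        # free-text persona description
    turns: List[Dict[str, str]] = field(default_factory=list)

def generate_persona(llm: LLM) -> str:
    # Multi-agentic persona generation, collapsed to a single call for brevity.
    return llm("Invent a detailed user persona: background, goals, and speaking style.")

def synthesize_dialogue(llm: LLM, agent: LLM, persona: str, max_turns: int = 6) -> Dialogue:
    # Condition a user-simulator LLM on the persona and alternate with the agent under test.
    d = Dialogue(persona)
    for _ in range(max_turns):
        user_msg = llm(f"Persona: {persona}\nConversation so far: {d.turns}\n"
                       "Write the user's next message, staying in persona.")
        d.turns.append({"role": "user", "content": user_msg})
        d.turns.append({"role": "assistant", "content": agent(str(d.turns))})
    return d

def judge_dialogue(llm: LLM, d: Dialogue) -> float:
    # LLM-as-judge: score realism and persona adherence on a 0-1 scale.
    verdict = llm(f"Persona: {d.persona}\nDialogue: {d.turns}\n"
                  "Rate realism and persona adherence from 0 to 1. Reply with the number only.")
    try:
        return float(verdict.strip())
    except ValueError:
        return 0.0

def build_corpus(llm: LLM, agent: LLM, n: int = 100, threshold: float = 0.7) -> List[Dialogue]:
    # Generate many persona-driven dialogues and keep those the judge rates highly,
    # yielding a curated synthetic corpus for testing the agent.
    corpus = []
    for _ in range(n):
        persona = generate_persona(llm)
        d = synthesize_dialogue(llm, agent, persona)
        if judge_dialogue(llm, d) >= threshold:
            corpus.append(d)
    return corpus
```

In practice the two callables would wrap a chat-completion backend and the conversational agent being evaluated; the resulting corpus can then be replayed or scored to measure the agent's performance and stability.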

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
