Abstract
Developing artificial intelligence (AI) agents for complex organizational settings may be constrained by a scarcity of high-fidelity training data, as real-world datasets can present privacy concerns and some synthetic data may lack causal depth. The disclosed systems and methods describe a multi-agent framework for generating synthetic organizational data. This framework can procedurally generate a social topology of simulated individuals and groups, where agents powered by large language models may operate within a temporal simulation loop to make decisions and perform actions based on a given context. An output of the process can be a persistent, queryable historical record of work artifacts, such as emails and documents, generated by agent actions. This can create a structured, causally linked dataset that mimics the digital footprint of a human organization, serving as a resource for training and evaluating AI agents while potentially mitigating certain limitations of real-world data.
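The pipeline summarized above (procedural topology, temporal simulation loop, persistent artifact record) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the names `Agent`, `build_topology`, `stub_policy`, and `run_simulation` are hypothetical, the `artifacts` table schema is invented, and a seeded random policy stands in for the large language model call. Causal linkage is modeled with a `parent_id` column that points a reply at the artifact that prompted it.

```python
import random
import sqlite3
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    team: str


def build_topology(n_teams, team_size):
    """Procedurally generate simulated individuals grouped into teams."""
    return [
        Agent(name=f"agent_{t}_{i}", team=f"team_{t}")
        for t in range(n_teams)
        for i in range(team_size)
    ]


def stub_policy(agent, inbox, rng):
    """Hypothetical stand-in for an LLM call: pick one action from context.

    Returns (kind, body, parent_id). parent_id is the id of the artifact
    that caused this action, or None for a self-initiated action.
    """
    if inbox:
        parent_id, sender = inbox[-1]
        return "email", f"{agent.name} replies to {sender}", parent_id
    kind = rng.choice(["email", "document"])
    return kind, f"{agent.name} starts a new {kind}", None


def run_simulation(agents, n_ticks, seed=0):
    """Run the temporal loop and return a SQLite connection holding artifacts."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE artifacts (id INTEGER PRIMARY KEY, tick INTEGER, "
        "author TEXT, kind TEXT, body TEXT, parent_id INTEGER)"
    )
    inboxes = {a.name: [] for a in agents}
    for tick in range(n_ticks):
        for agent in agents:
            kind, body, parent_id = stub_policy(agent, inboxes[agent.name], rng)
            cur = db.execute(
                "INSERT INTO artifacts (tick, author, kind, body, parent_id) "
                "VALUES (?, ?, ?, ?, ?)",
                (tick, agent.name, kind, body, parent_id),
            )
            inboxes[agent.name].clear()
            if kind == "email":
                # Deliver the email to one randomly chosen teammate, if any.
                teammates = [
                    a for a in agents
                    if a.team == agent.team and a.name != agent.name
                ]
                if teammates:
                    recipient = rng.choice(teammates)
                    inboxes[recipient.name].append((cur.lastrowid, agent.name))
    db.commit()
    return db


if __name__ == "__main__":
    db = run_simulation(build_topology(n_teams=2, team_size=3), n_ticks=4)
    # Query the historical record: replies joined to the artifacts that caused them.
    query = (
        "SELECT c.tick, c.body, p.body FROM artifacts c "
        "JOIN artifacts p ON c.parent_id = p.id ORDER BY c.id"
    )
    for row in db.execute(query):
        print(row)
```

In this sketch each agent emits exactly one artifact per tick, and the self-join on `parent_id` recovers cause-and-effect chains from the stored record. A fuller system would presumably replace `stub_policy` with a model call and persist the database to disk rather than keeping it in memory.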
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Bernico, Mike; Chen, Hanwen; Thodoroff, Pierre; Javadzadeh, Sara; Zhang, Xiao; Chen, Stephen; and Afshari, Saeedeh, "Multi-Agent Simulation for Generating Synthetic Datasets of Organizational Artifacts", Technical Disclosure Commons, (April 09, 2026)
https://www.tdcommons.org/dpubs_series/9745