Defensive Publications Series

Automated Prompt Optimization for Large Language Model Training Data Synthesis

Abstract

The generation of high-fidelity synthetic training data for large language models is often constrained by a reliance on static, manually engineered prompts. A single optimized prompt typically fails to achieve high performance across an entire query distribution, leading to coverage gaps and reduced diversity in the resulting training trajectories. To address these limitations, an automated prompt optimization method is disclosed. An evolutionary process is employed to iteratively refine prompts by leveraging a meta-prompting mechanism that incorporates task-specific chain-of-thought guidelines. Rather than converging on a single global optimum, a set of high-performing prompts is identified. Candidate responses are generated for each query using this ensemble of prompts, and a selection module identifies the highest-quality trajectory for inclusion in the training corpus. This approach improves the robustness and accuracy of fine-tuned models by maximizing the correctness of synthetic datasets and recovering performance on edge-case queries.
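The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the disclosed implementation: the abstract gives no algorithmic details, so the function names (`evolve_prompts`, `synthesize`, `mean_score`), the population parameters, and the selection rule (mean score over the query set) are all assumptions made for illustration. The language model, the meta-prompting step, and the selection module are replaced by deterministic toy stand-ins (`toy_generate`, `toy_mutate`, `toy_score`) so that the example runs without any external service.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; the disclosure does not specify any of these.
Generate = Callable[[str, str], str]   # (prompt, query) -> response
Score = Callable[[str, str], float]    # (query, response) -> quality in [0, 1]
Mutate = Callable[[str, str], str]     # (prompt, guidelines) -> rewritten prompt


def mean_score(prompt: str, queries: List[str],
               generate: Generate, score: Score) -> float:
    """Average response quality of one prompt over the query set."""
    return sum(score(q, generate(prompt, q)) for q in queries) / len(queries)


def evolve_prompts(seeds: List[str], queries: List[str], generate: Generate,
                   score: Score, mutate: Mutate, guidelines: str,
                   generations: int = 3, population: int = 4,
                   top_k: int = 2) -> List[str]:
    """Refine prompts iteratively and return a set of top_k survivors,
    rather than a single best prompt."""
    pool = list(seeds)
    for _ in range(generations):
        # Stand-in for meta-prompting: rewrite each prompt using guidelines.
        children = [mutate(p, guidelines) for p in pool]
        # Deduplicate while preserving order, then keep the fittest prompts.
        unique = list(dict.fromkeys(pool + children))
        unique.sort(key=lambda p: mean_score(p, queries, generate, score),
                    reverse=True)
        pool = unique[:population]
    return pool[:top_k]


def synthesize(queries: List[str], ensemble: List[str],
               generate: Generate, score: Score) -> List[Tuple[str, str]]:
    """For each query, generate one candidate per prompt and keep the best."""
    corpus = []
    for q in queries:
        candidates = [generate(p, q) for p in ensemble]
        corpus.append((q, max(candidates, key=lambda r: score(q, r))))
    return corpus


# --- Toy stand-ins (assumptions; a real system would call an LLM) ---
def toy_generate(prompt: str, query: str) -> str:
    topic = query.split(":")[0]
    helpful = topic in prompt and "step by step" in prompt
    return f"{'good' if helpful else 'weak'} answer to {query}"


def toy_score(query: str, response: str) -> float:
    return 1.0 if response.startswith("good") else 0.0


def toy_mutate(prompt: str, guidelines: str) -> str:
    return prompt if guidelines in prompt else f"{prompt} {guidelines}"


QUERIES = ["math: 12*7", "code: reverse a list"]
SEEDS = ["Answer the question.", "You are a math tutor.",
         "You are a code reviewer."]

ensemble = evolve_prompts(SEEDS, QUERIES, toy_generate, toy_score,
                          toy_mutate, guidelines="Reason step by step.")
corpus = synthesize(QUERIES, ensemble, toy_generate, toy_score)
```

In this toy setup each surviving prompt handles only one kind of query, so any single prompt scores 0.5 on average, while selecting the best candidate per query from the two-prompt ensemble yields a good response for both queries. Ranking by mean score is only one possible selection rule; a criterion that rewards coverage of queries the current set handles poorly may be closer to the stated goal of recovering edge cases.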

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Wu, Hui; Zhou, Yulou; Bhardwaj, Rishabh; and Zhao, Simon, "Automated Prompt Optimization for Large Language Model Training Data Synthesis", Technical Disclosure Commons, (May 17, 2026)
https://www.tdcommons.org/dpubs_series/10137

