Abstract

The generation of high-fidelity synthetic training data for large language models is often constrained by a reliance on static, manually engineered prompts. A single optimized prompt typically fails to achieve high performance across an entire query distribution, leading to coverage gaps and reduced diversity in the resulting training trajectories. To address these limitations, an automated prompt optimization method is disclosed. An evolutionary process is employed to iteratively refine prompts by leveraging a meta-prompting mechanism that incorporates task-specific chain-of-thought guidelines. Rather than converging on a single global optimum, a set of high-performing prompts is identified. Candidate responses are generated for each query using this ensemble of prompts, and a selection module identifies the highest-quality trajectory for inclusion in the training corpus. This approach improves the robustness and accuracy of fine-tuned models by maximizing the correctness of synthetic datasets and recovering performance on edge-case queries.
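The ensemble-and-select step described above can be illustrated with a minimal Python sketch. This is not the disclosed implementation: the function names, the `generate` and `score` callables, and the toy stand-ins are all hypothetical, and the evolutionary refinement loop and the meta-prompt are omitted. The sketch only shows keeping several high-scoring prompts and choosing, per query, the candidate response that a scorer rates highest.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signatures: generate(prompt, query) -> response,
# score(query, response) -> quality estimate (higher is better).
Generate = Callable[[str, str], str]
Score = Callable[[str, str], float]


def select_top_prompts(prompt_scores: Dict[str, float], k: int) -> List[str]:
    """Keep the k highest-scoring prompts rather than a single best one."""
    ranked = sorted(prompt_scores, key=lambda p: prompt_scores[p], reverse=True)
    return ranked[:k]


def best_trajectory(query: str, prompts: List[str],
                    generate: Generate, score: Score) -> Tuple[str, str]:
    """Generate one candidate per prompt; return the (prompt, response) scored highest."""
    candidates = [(p, generate(p, query)) for p in prompts]
    return max(candidates, key=lambda c: score(query, c[1]))


def build_corpus(queries: List[str], prompts: List[str],
                 generate: Generate, score: Score) -> List[Tuple[str, str]]:
    """Pair each query with its selected response for the training corpus."""
    return [(q, best_trajectory(q, prompts, generate, score)[1]) for q in queries]


# Toy stand-ins (assumptions for illustration only, no model is called).
def toy_generate(prompt: str, query: str) -> str:
    return f"[{prompt}] {query}"


def toy_score(query: str, response: str) -> float:
    # Pretend math queries need the step-by-step prompt and others do not.
    return 1.0 if ("math" in query) == ("step" in response) else 0.0


corpus = build_corpus(["math: 2+2", "greet"],
                      ["terse", "step-by-step"],
                      toy_generate, toy_score)
```

In this toy setup neither prompt wins on both queries, so selecting per query recovers a good response for each, which is the coverage argument the abstract makes for using a set of prompts.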

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
