Abstract
Performance of generative models (GMs), such as Large Language Models (LLMs), is highly dependent on the quality of the input prompt provided to the GM. However, manual prompt engineering is often inefficient and requires specialized expertise. This disclosure describes a method for the automated optimization of structured prompts through an iterative evaluation process. A structured prompt template is initially populated with task descriptions and reference data to generate candidate seed prompts. These candidate seed prompts are executed against a target model, and the resulting responses are assessed by an automated evaluator. Based on one or more performance metrics, a refinement loop is initiated in which prompts are iteratively mutated and re-evaluated. The process integrates a chain-of-thought methodology within the instructions to improve reasoning on complex tasks such as classification, query generation, and/or summarization. The automated refinement continues until performance thresholds are met, resulting in an optimized prompt that enhances the accuracy of GM output and reduces the manual effort typically required for prompt development.
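
The following Python sketch illustrates one possible shape of the loop described above: populate a seed prompt from a template, evaluate it against a target model, and mutate until a score threshold is reached. The template text, the helper names (call_model, score_response, mutate_prompt), and the threshold and iteration limits are illustrative assumptions for exposition, not details taken from the disclosure.

import random

# Illustrative structured template; the chain-of-thought instruction is the
# "Think step by step" line. The exact template is an assumption.
SEED_TEMPLATE = (
    "Task: {task_description}\n"
    "Reference data: {reference_data}\n"
    "Think step by step before answering.\n"
    "Answer:"
)

def call_model(prompt: str) -> str:
    """Placeholder for a call to the target generative model (e.g. an LLM API)."""
    return "placeholder response"

def score_response(response: str, expected: str) -> float:
    """Placeholder automated evaluator; returns a score in [0, 1]."""
    return float(response.strip().lower() == expected.strip().lower())

def mutate_prompt(prompt: str, rng: random.Random) -> str:
    """Placeholder mutation, e.g. appending or rephrasing an instruction."""
    variants = ["Be concise.", "Explain your reasoning briefly.",
                "Answer with a single label."]
    return prompt + "\n" + rng.choice(variants)

def evaluate(prompt: str, eval_set: list[dict]) -> float:
    """Run the prompt against the model on each item and average evaluator scores."""
    scores = [score_response(call_model(prompt + "\n" + item["input"]),
                             item["expected"])
              for item in eval_set]
    return sum(scores) / len(scores)

def optimize_prompt(task_description: str, reference_data: str,
                    eval_set: list[dict], threshold: float = 0.9,
                    max_iterations: int = 20, seed: int = 0):
    rng = random.Random(seed)
    # Populate the structured template to produce a candidate seed prompt.
    best_prompt = SEED_TEMPLATE.format(task_description=task_description,
                                       reference_data=reference_data)
    best_score = evaluate(best_prompt, eval_set)
    for _ in range(max_iterations):
        # Stop refining once the performance threshold is met.
        if best_score >= threshold:
            break
        # Otherwise mutate the best prompt so far and re-evaluate it.
        candidate = mutate_prompt(best_prompt, rng)
        candidate_score = evaluate(candidate, eval_set)
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score

if __name__ == "__main__":
    # Hypothetical single-item evaluation set for a classification task.
    eval_items = [{"input": "Is this review positive or negative? 'Great!'",
                   "expected": "positive"}]
    prompt, score = optimize_prompt("Sentiment classification",
                                    "Labels: positive, negative", eval_items)
    print(score)
    print(prompt)

In the disclosure, the evaluator and mutation strategy may themselves be model-driven; the greedy accept-if-better rule shown here is only one simple choice of refinement policy.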
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Wu, Hui; Huang, Haoda; Shen, Samuel; Filippi, Enrica; Zitouni, Imed; Zhao, Simon; and Fisher, Zach, "Automated Optimization of Structured Large Language Model Prompts via Iterative Evaluation", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/9189