Abstract
Recent research has demonstrated that training on a carefully selected subset of long chain-of-thought (CoT) trajectories can significantly improve the performance of reasoning large language models (LLMs) on mathematical question-answering tasks. Proposed herein are techniques to extract long CoT reasoning traces, which can be useful for fine-tuning artificial intelligence (AI) models, such as LLMs.
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Payani, Ali and Kompella, Ramana, "GRAPH SIMILARITY-BASED SELECTION OF LONG CHAIN-OF-THOUGHT TRAJECTORIES FOR MODEL TRAINING", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/10235