Abstract

Recent research has demonstrated that selecting a carefully curated subset of long chain-of-thought (CoT) trajectories can significantly improve the performance of reasoning large language models (LLMs) on mathematical question-answering tasks. Proposed herein are techniques to extract long CoT reasoning traces, which can be useful for fine-tuning artificial intelligence (AI) models, such as LLMs.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
