Abstract

The techniques introduced here provide a reinforcement learning-based dynamic token injection system for guiding Large Language Model (LLM) reasoning during inference. A Small Language Model (SLM), using rewards for output quality and penalties for excessive length, may monitor the LLM's reasoning output and selectively inject tokens that either prompt further reasoning or bring the reasoning to an earlier conclusion.
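As an illustration only, the monitoring loop and the reward shape described above might be sketched as follows. The abstract gives no implementation details, so every name here is a hypothetical stand-in: the control tokens (`CONTINUE_TOKEN`, `STOP_TOKEN`), the callables `llm_step` and `slm_policy`, the linear length penalty, and the parameters `length_budget` and `length_weight`. The reinforcement learning update that would train the SLM on this reward is not shown.

```python
from typing import Callable, List

# Hypothetical control tokens; the source does not name the injected tokens.
CONTINUE_TOKEN = "Wait,"      # assumed token that prompts further reasoning
STOP_TOKEN = "</think>"       # assumed token that concludes reasoning early


def reward(quality: float, num_tokens: int,
           length_budget: int, length_weight: float = 0.001) -> float:
    """Assumed reward shape: output quality minus a penalty that grows
    linearly with the number of tokens beyond a length budget."""
    excess = max(0, num_tokens - length_budget)
    return quality - length_weight * excess


def guided_reasoning(llm_step: Callable[[str, List[str]], List[str]],
                     slm_policy: Callable[[List[str]], str],
                     prompt: str,
                     max_steps: int = 32) -> List[str]:
    """Run the LLM chunk by chunk while the SLM monitors the trace.

    llm_step(prompt, trace) returns the next chunk of reasoning tokens.
    slm_policy(trace) returns "continue", "stop", or "none".
    """
    trace: List[str] = []
    for _ in range(max_steps):
        trace.extend(llm_step(prompt, trace))
        action = slm_policy(trace)
        if action == "stop":
            # Inject a token that ends the reasoning phase early.
            trace.append(STOP_TOKEN)
            break
        if action == "continue":
            # Inject a token that nudges the LLM to keep reasoning.
            trace.append(CONTINUE_TOKEN)
    return trace


# Toy stand-ins for the two models, used only to exercise the loop.
def toy_llm_step(prompt: str, trace: List[str]) -> List[str]:
    return ["step"]


def toy_slm_policy(trace: List[str]) -> str:
    if len(trace) == 1:
        return "continue"
    if len(trace) >= 4:
        return "stop"
    return "none"


toy_trace = guided_reasoning(toy_llm_step, toy_slm_policy, "2 + 2 = ?")
```

In this sketch the SLM acts only at chunk boundaries, which keeps its overhead small relative to the LLM; a per-token policy would be an equally valid reading of the abstract.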

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Payani, Ali; Lee, Myungjin; and Kompella, Ramana, "DYNAMIC TOKEN INJECTION FOR ENHANCED LANGUAGE MODEL REASONING USING REINFORCEMENT LEARNING", Technical Disclosure Commons, (June 16, 2026)
https://www.tdcommons.org/dpubs_series/10466

Technical Disclosure Commons

Defensive Publications Series

DYNAMIC TOKEN INJECTION FOR ENHANCED LANGUAGE MODEL REASONING USING REINFORCEMENT LEARNING
