Abstract
In decentralized machine learning environments, a local model's auto-regressive sequence generation may underperform a larger, centralized model because it may lack a mechanism to leverage external knowledge during inference. A gated, multi-source forcing mechanism can dynamically evaluate the local model's uncertainty during generation, for example by computing the entropy of its next-token prediction distribution. If the uncertainty exceeds a configurable threshold, a token may be retrieved from an external teacher source, such as a remote centralized model or an on-device data corpus, and injected into the local model's decoding stream to guide subsequent generation. This approach may allow a local model, such as one on a smartphone or wearable device, to be selectively guided by external knowledge, potentially improving the quality and accuracy of the output while managing resource usage by invoking external sources only when doing so is determined to be beneficial.
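The gating loop described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `local_model`, `teacher`, and `gated_decode`, the toy distributions, and the threshold value are all illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def gated_decode(local_model, teacher, prompt, threshold, max_tokens):
    """Entropy-gated decoding: when the local model's next-token
    distribution is too uncertain, inject a token retrieved from the
    external teacher source; otherwise decode greedily on-device."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        probs = local_model(tokens)  # local next-token distribution
        if entropy(probs) > threshold:
            tok = teacher(tokens)    # retrieve token from teacher source
        else:
            tok = max(range(len(probs)), key=probs.__getitem__)  # greedy
        tokens.append(tok)
    return tokens

# Hypothetical toy models over a 3-token vocabulary:
# the local model is confident only after seeing token 0.
def local_model(tokens):
    if tokens[-1] == 0:
        return [0.9, 0.05, 0.05]   # low entropy: decode locally
    return [1/3, 1/3, 1/3]         # high entropy: defer to teacher

def teacher(tokens):
    return 0  # the teacher always suggests token 0

print(gated_decode(local_model, teacher, [1], threshold=1.0, max_tokens=3))
# → [1, 0, 0, 0]: the first step defers to the teacher, later steps stay local
```

In a real deployment, the threshold would trade off output quality against the cost of contacting the remote source, since each teacher call incurs network or lookup overhead.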
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Carbune, Victor and Hartmann, Florian, "Gated Multi-Source Forcing for Auto-Regressive Sequence Generation", Technical Disclosure Commons, (December 21, 2025)
https://www.tdcommons.org/dpubs_series/9066