Abstract
Traditional large language models (LLMs) are single, massive models that are expensive and time-consuming to train. A mixture of experts (MoE) architecture mitigates the cost and inflexibility of traditional LLMs by activating only a subset of the model parameters (known as experts) at each inference step. However, even MoEs lack true modularity, as their experts are trained together and generally do not collaborate. This disclosure describes an LLM architecture and techniques that use multiple complete, independently pre-trained LLMs as modular macro-experts. The architecture, referred to as a heterogeneous macro mixture of experts (macro-MoE), includes a trainable gating network that dynamically routes input prompts to the most appropriate expert or sequence of experts. A dual-function orchestrator synthesizes parallel outputs for simple tasks and manages a collaborative, multi-step generation process for complex tasks by routing intermediate results between different experts. The techniques enable a highly modular, computationally efficient LLM that can solve complex problems by leveraging the specialized strengths of diverse, state-of-the-art models within a cohesive, integrated framework.
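The sketch below is a minimal, illustrative rendering of the control flow described in the abstract: a gating network selects among complete pre-trained LLMs acting as macro-experts, and a dual-function orchestrator either synthesizes parallel outputs (simple tasks) or chains experts over multiple steps (complex tasks). The disclosure does not specify an API; all class names, method names, and the keyword-based scoring below are hypothetical stand-ins for trained components, included only to make the routing structure concrete.

```python
# Hypothetical sketch of the macro-MoE routing described above.
# Each "macro-expert" stands in for a complete, independently pre-trained LLM,
# abstracted here as a simple text -> text callable.
from typing import Callable, Dict, List

Expert = Callable[[str], str]


class GatingNetwork:
    """Toy stand-in for the trainable gating network that scores experts for a prompt."""

    def __init__(self, experts: Dict[str, Expert]):
        self.experts = experts

    def route(self, prompt: str, top_k: int = 1) -> List[str]:
        # A real gating network would be learned; keyword overlap is used here
        # purely to make the control flow runnable.
        scores = {name: sum(tag in prompt.lower() for tag in name.split("_"))
                  for name in self.experts}
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked[:top_k]


class Orchestrator:
    """Dual-function orchestrator: parallel synthesis for simple tasks,
    sequential multi-step collaboration for complex tasks."""

    def __init__(self, gate: GatingNetwork, experts: Dict[str, Expert]):
        self.gate = gate
        self.experts = experts

    def run_simple(self, prompt: str) -> str:
        # Query the top-ranked experts independently and synthesize their outputs.
        outputs = [self.experts[name](prompt)
                   for name in self.gate.route(prompt, top_k=2)]
        return "\n".join(outputs)  # placeholder for the synthesis step

    def run_complex(self, prompt: str, max_steps: int = 3) -> str:
        # Route intermediate results between experts across multiple steps.
        state = prompt
        for _ in range(max_steps):
            expert_name = self.gate.route(state, top_k=1)[0]
            state = self.experts[expert_name](state)
        return state


if __name__ == "__main__":
    experts = {
        "code_expert": lambda p: f"[code expert] {p}",
        "math_expert": lambda p: f"[math expert] {p}",
    }
    orchestrator = Orchestrator(GatingNetwork(experts), experts)
    print(orchestrator.run_simple("Explain this code snippet"))
    print(orchestrator.run_complex("Solve this math problem, then write code for it"))
```

In a full implementation, the gating network would be the only trainable component, while the macro-experts remain frozen, which is what makes the architecture modular: experts can be swapped or added without retraining the ensemble.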
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
NA, "Modular Language Model Architecture with Collaborative Routing Between Heterogeneous Experts", Technical Disclosure Commons, (August 18, 2025)
https://www.tdcommons.org/dpubs_series/8468