Abstract

The present disclosure relates to systems and methods for serving machine learning models by partitioning model components for distributed execution across heterogeneous hardware resources. In particular, the disclosure describes techniques for splitting models into independently executable subgraphs deployed across a cluster of machines with varying hardware configurations (e.g., CPUs, GPUs, TPUs). The system leverages the distinct strengths of each hardware type by mapping memory-intensive or preprocessing-heavy components (such as large embedding-table lookups) to CPU machines while assigning compute-intensive subgraphs to accelerators such as TPUs. A runtime system manages inter-machine communication and orchestrates inference tasks across the partitions, enabling improved hardware utilization, scalability, and performance when serving large, embedding-heavy models.
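To make the partitioning and orchestration idea concrete, the following is a minimal, self-contained Python sketch of the general technique. The disclosure does not specify an API; all names here (Subgraph, assign_device, Orchestrator) are hypothetical, placement is reduced to a single memory-bound flag, and the inter-machine RPC boundary is simulated with an in-process call.

```python
"""Illustrative sketch of heterogeneous model partitioning and serving.

All class and function names are hypothetical; this is not the disclosed
implementation, only a toy model of the described architecture.
"""

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subgraph:
    """An independently executable partition of a larger model graph."""
    name: str
    run: Callable[[Dict], Dict]  # executes this subgraph on a batch
    memory_bound: bool = False   # e.g., large embedding-table lookups


def assign_device(sg: Subgraph) -> str:
    """Heuristic placement: memory-intensive subgraphs go to CPU hosts
    with ample RAM; compute-intensive ones go to accelerators (TPU here)."""
    return "cpu-host" if sg.memory_bound else "tpu-worker"


class Orchestrator:
    """Runs partitioned subgraphs in sequence, forwarding intermediate
    activations between the (simulated) machines that host them."""

    def __init__(self, subgraphs: List[Subgraph]):
        self.plan = [(sg, assign_device(sg)) for sg in subgraphs]

    def infer(self, inputs: Dict) -> Dict:
        activations = inputs
        for sg, device in self.plan:
            # In a real system this boundary would be an RPC to `device`;
            # here the subgraph is simply invoked in-process.
            activations = sg.run(activations)
        return activations


# Example: an embedding-heavy model split into two stages.
embedding_stage = Subgraph(
    name="embedding_lookup",
    run=lambda x: {"embedded": [v * 2 for v in x["ids"]]},  # toy lookup
    memory_bound=True,   # large tables -> CPU host
)
dense_stage = Subgraph(
    name="dense_layers",
    run=lambda x: {"score": sum(x["embedded"])},  # toy compute stage
    memory_bound=False,  # matmul-heavy -> TPU worker
)

orchestrator = Orchestrator([embedding_stage, dense_stage])
print(orchestrator.infer({"ids": [3, 1, 4]}))  # {'score': 16}
```

In this sketch the placement decision and the execution plan are computed once, ahead of serving; a production runtime of the kind described would additionally handle batching, failure recovery, and transport of tensors between machines.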

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.