Abstract

On-device machine learning models can provide inferences quickly; however, such models are often large and must be downloaded to the device. In many domains, such as optical character recognition or translation, the nature of the problem requires that several on-device models be available.

This disclosure describes techniques to split inference computation between the device and a remote server. The techniques leverage the observation that many ML models have domain-independent layers that are shared between multiple models and domain-specific layers that are particular to individual models. Per the techniques described herein, the smallest available intermediate representation produced by the shared layers is transmitted to the server for inference. Alternatively, a bottleneck layer is inserted between the domain-independent and domain-specific layers of the model. The shared layers of the model run on the device, while the domain-specific layers run on the server. Using the smallest intermediate representation eliminates the need to store large models locally, reduces the cost and delay of network communication by minimizing the data transmitted during inference, and allows leveraging the power and flexibility of server-based machine learning models.
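The split described above can be illustrated with a minimal sketch. It is not the disclosure's implementation: the layer sizes, the random weights, the domain names (`ocr_latin`, `ocr_cyrillic`), and the helper functions `device_encode` and `server_infer` are all hypothetical, and the network hop is simulated by passing a byte string between two functions. The sketch only shows that the device runs the shared layers, sends the small bottleneck representation, and the server picks a domain-specific head.

```python
import random
import struct

def dense(x, weights):
    """One fully connected layer with ReLU; `weights` is a list of rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def random_layer(n_out, n_in, rng):
    """Random stand-in weights; a real model would load trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)

# Hypothetical shared (domain-independent) layers kept on the device.
# The last one narrows to 8 values and plays the role of the bottleneck.
SHARED_LAYERS = [random_layer(32, 64, rng), random_layer(8, 32, rng)]

# Hypothetical domain-specific heads kept on the server, one per domain.
DOMAIN_HEADS = {
    "ocr_latin": [random_layer(16, 8, rng), random_layer(4, 16, rng)],
    "ocr_cyrillic": [random_layer(16, 8, rng), random_layer(4, 16, rng)],
}

def device_encode(raw_input):
    """Run the shared layers on-device and serialize the bottleneck output."""
    x = raw_input
    for layer in SHARED_LAYERS:
        x = dense(x, layer)
    # Pack as little-endian 32-bit floats: 8 values -> 32 bytes on the wire.
    return struct.pack(f"<{len(x)}f", *x)

def server_infer(payload, domain):
    """Deserialize the representation and run the requested domain head."""
    count = len(payload) // 4
    x = list(struct.unpack(f"<{count}f", payload))
    for layer in DOMAIN_HEADS[domain]:
        x = dense(x, layer)
    return x

raw_input = [rng.random() for _ in range(64)]
payload = device_encode(raw_input)            # sent over the network
scores = server_infer(payload, "ocr_latin")   # computed on the server

raw_bytes = len(struct.pack("<64f", *raw_input))
print(f"raw input: {raw_bytes} bytes, transmitted: {len(payload)} bytes")
```

In this toy setup the device transmits 32 bytes instead of the 256-byte raw input, and adding a new domain only requires a new server-side head rather than a new on-device model.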

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Jaggi, Jorim; Stucki, Yannick; Roos, Adrian; and Feuz, Sandro, "Shared On-device and Server-based Machine Learning Evaluation", Technical Disclosure Commons, (April 03, 2020)
https://www.tdcommons.org/dpubs_series/3094

Technical Disclosure Commons

Defensive Publications Series

Shared On-device and Server-based Machine Learning Evaluation