Abstract

Certain automated neural-network design procedures not only minimize prediction error but also shrink or prune the network to reduce inference latency. Targeting inference latency directly is difficult; hence, FLOP count is often used as a proxy. However, FLOP count is only loosely correlated with actual inference latency.

This disclosure describes techniques for direct computation or measurement of targeted costs such as inference latency, energy consumption, throughput, model size, etc. By integrating such targeted costs into design procedures, high-performance neural networks with low inference latency, small model size, and low energy consumption can be obtained. The techniques find application in domains where fast, low-power neural networks are advantageous, e.g., image classification, language translation, optical character recognition, self-driving cars, and interactive augmented/virtual reality.
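As one illustration of how a measured cost can replace the FLOP-count proxy, the sketch below scores a candidate network by its validation error plus a penalty on measured wall-clock latency. This is a minimal, hypothetical example in PyTorch; the helper names, the latency budget target_latency, and the weight alpha are assumptions for illustration and are not specified in the disclosure.

# Illustrative sketch only: a latency-aware search objective that uses a
# measured cost (wall-clock inference latency) rather than a FLOP-count proxy.
# Names such as candidate_model, validation_error, alpha, and target_latency
# are hypothetical and not taken from the disclosure.
import time
import torch


def measure_latency(model, input_shape=(1, 3, 224, 224), runs=50):
    """Return the median wall-clock latency (seconds) of one forward pass."""
    model.eval()
    x = torch.randn(*input_shape)
    with torch.no_grad():
        # Warm-up passes so one-time setup costs do not skew the measurement.
        for _ in range(5):
            model(x)
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            model(x)
            timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[len(timings) // 2]


def search_objective(candidate_model, validation_error, alpha=0.1, target_latency=0.02):
    """Lower is better: prediction error plus a penalty for exceeding a latency budget."""
    latency = measure_latency(candidate_model)
    return validation_error + alpha * max(0.0, latency / target_latency - 1.0)

A design procedure such as architecture search or pruning could then select candidates that minimize this objective, so that the targeted cost (here, latency measured on the deployment device) is optimized directly rather than approximated by FLOPs.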

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
