Leveraging the vectorizability of deep-learning weight updates, this disclosure describes processing-in-memory (PIM) techniques for weight updates in a large class of deep-learning networks. Rather than importing the state of the deep-learning optimizer to the computational die, the techniques send gradients to a die of a high-bandwidth memory (HBM) stack and perform the modest number of optimizer update operations in compute units located on that die. Since weight and optimizer-state reads and writes occur inside the HBM stack, the techniques can substantially reduce CPU-HBM bandwidth requirements. Weight-related memory traffic, which dominates for multilayer perceptrons and transformers, is also reduced.
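To make the data-movement savings concrete, the following is a minimal sketch of the idea using SGD with momentum as a stand-in optimizer (the disclosure does not name a specific optimizer, and the function and variable names here are hypothetical). The key point it illustrates is that only the gradient array crosses the CPU-HBM interface, while the weights and optimizer state are read and updated in place, as they would be by compute units inside the memory stack.

```python
import numpy as np

def pim_weight_update(weights, momentum, gradients, lr=0.01, beta=0.9):
    """Vectorized SGD-with-momentum update, as it might execute in
    compute units on an HBM die. Only `gradients` is sent from the
    computational die; `weights` and `momentum` (the optimizer state)
    are read and written entirely within the memory stack.
    """
    # In-place updates: optimizer state never leaves the HBM stack.
    momentum[:] = beta * momentum + gradients
    weights[:] -= lr * momentum
    return weights

# Host-side view of one training step: per step, the traffic to the
# stack is the gradient array alone, not weights plus optimizer state.
w = np.zeros(4)
m = np.zeros(4)
g = np.array([1.0, 2.0, 3.0, 4.0])
pim_weight_update(w, m, g)
```

Because the update is elementwise over the weight vector, it parallelizes naturally across many small in-memory compute units, which is what makes the "modest number of optimizer update operations" tractable on a memory die.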
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
N/A, "Processing-in-Memory for Weight Updates in a Neural Network Accelerator", Technical Disclosure Commons, (August 13, 2021)