Deep networks have shown success in many challenging applications, e.g., image understanding and natural language processing. This success is traced to the large number of neurons deployed, each with weighted interconnections to other neurons. The large number of weights yields high classification accuracy, but also consumes significant memory.

This disclosure describes techniques to reduce the number of weights used in deep networks by representing each matrix of deep network weights as the Kronecker product of two or more smaller matrices. The reduction in weights is made possible by the observation that deep networks do not always use a majority of their weights. Training procedures are described for the resulting compressed network. The techniques of this disclosure enable deep networks to be deployed in small-footprint applications, e.g., on mobile or wearable devices. Applications with no immediate memory constraint, e.g., servers, also benefit from the greater speed of deployment enabled by the techniques herein.
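
As a rough illustration of the weight reduction, the following sketch stores a layer's weight matrix as two small Kronecker factors and applies it to an input without ever forming the full matrix. The factor shapes (16 x 32 each), the helper name `kron_matvec`, and the use of NumPy are assumptions chosen for illustration; they are not taken from the disclosure, and the sketch does not cover its training procedures.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute np.kron(A, B) @ x without materializing the full matrix.

    Hypothetical helper for illustration only.
    """
    n1, n2 = A.shape[1], B.shape[1]
    # Row-major reshape matches the index layout used by np.kron.
    X = x.reshape(n1, n2)
    # (A kron B) x equals A X B^T, flattened back in row-major order.
    return (A @ X @ B.T).reshape(-1)

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 32))   # first factor (assumed shape)
B = rng.standard_normal((16, 32))   # second factor (assumed shape)
x = rng.standard_normal(32 * 32)    # input vector of length 1024

# The equivalent dense layer would be 256 x 1024.
full_params = (A.shape[0] * B.shape[0]) * (A.shape[1] * B.shape[1])
kron_params = A.size + B.size

y = kron_matvec(A, B, x)
print(full_params, kron_params, y.shape)
```

With these assumed shapes, the dense matrix would hold 262,144 weights while the two factors together hold 1,024. How much accuracy is retained at a given compression level depends on the network and on training, which this sketch does not address.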

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.