Abstract

When pairwise dot products are computed between input embedding vectors and used for further computation, the number of dot products grows quadratically with the number of embedding vectors. This can create an efficiency bottleneck and degrade the performance of machine learning models. This disclosure describes techniques to obtain a compressed dot product matrix from sparse input embeddings. Compressed embeddings are generated using a weights matrix that is initialized randomly and learnt alongside other parts of the model, and the compressed embeddings are then used to obtain a compressed dot product. To improve performance, attention weights derived from the input embeddings can be used as the weights matrix. Still further, a high-level representation of the input embeddings can be obtained and combined with a low-level representation. The described compression techniques can improve model accuracy, as measured by normalized entropy, and can improve model execution efficiency. The reduction in size of the dot product matrix, enabled by the described techniques, reduces computational complexity.
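The compression idea can be illustrated with a minimal NumPy sketch. The abstract does not give shapes or formulas, so everything concrete here is an assumption: the n input embeddings are stacked in a matrix `X` of shape (n, d), the weights matrix `W` has shape (n, k) with k < n, and the compressed embeddings are formed as `W.T @ X`. The `attention_weights` helper and its projection `P` are hypothetical stand-ins for "attention weights derived from the input embeddings", not the disclosure's actual mechanism.

```python
import numpy as np

def full_dot_products(X):
    """All pairwise dot products: an (n, n) matrix for n embeddings."""
    return X @ X.T

def compressed_dot_products(X, W):
    """Compressed dot product matrix of shape (n, k).

    X: (n, d) input embeddings.
    W: (n, k) weights matrix with k < n (assumed shape).
    """
    Y = W.T @ X      # (k, d) compressed embeddings
    return X @ Y.T   # (n, k) dot products against the compressed embeddings

def attention_weights(X, P):
    """Hypothetical attention-derived weights: softmax over the n embeddings."""
    scores = X @ P                                        # (n, k)
    scores = scores - scores.max(axis=0, keepdims=True)   # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=0, keepdims=True)               # each column sums to 1

rng = np.random.default_rng(0)
n, d, k = 64, 16, 8
X = rng.normal(size=(n, d))
W = rng.normal(size=(n, k)) * 0.1   # random init; would be learnt with the model
P = rng.normal(size=(d, k)) * 0.1   # hypothetical projection for the attention variant

full = full_dot_products(X)                                       # (64, 64)
comp = compressed_dot_products(X, W)                              # (64, 8)
comp_attn = compressed_dot_products(X, attention_weights(X, P))   # (64, 8)
```

Under these assumptions the result equals the full dot product matrix multiplied by `W`, but the n × n matrix is never formed: the cost of building the interaction features drops from O(n²d) to O(nkd), and downstream layers consume n × k values instead of n × n.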

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Anonymous, "Dot Product Matrix Compression for Machine Learning", Technical Disclosure Commons, (December 20, 2019)
https://www.tdcommons.org/dpubs_series/2807

Technical Disclosure Commons

Defensive Publications Series

Dot Product Matrix Compression for Machine Learning
