Abstract
In the context of large language models (LLMs), memory bandwidth has not kept pace with the computational capabilities of machine learning processing units, making memory a performance bottleneck. This disclosure describes a memory controller and data retrieval techniques that use semantic search to reduce the complexity of memory access and, by strictly enforcing data requirements, to reduce the amount of data that must be transferred. In contrast to traditional memory controllers, which require precisely specified data locations, the described memory controller uses contextual intelligence to determine the importance of requested data, such that only essential data is retrieved. This effectively compresses the data and conserves resources such as storage capacity and memory bandwidth. Furthermore, the controller offloads the management of memory access patterns, freeing machine learning processing units for their primary tasks. The controller scales with, and conforms to, the requirements of artificial intelligence (AI) workloads generally and LLMs in particular.
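The abstract does not specify the retrieval mechanism; the following Python sketch illustrates one plausible form of the relevance gating described above, assuming each stored block is tagged with a semantic embedding and the controller serves only blocks whose cosine similarity to the query exceeds a threshold. The class name SemanticMemoryController, the threshold value, and the helper function are hypothetical, not part of the disclosure.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

class SemanticMemoryController:
    """Toy semantic memory controller: stores (embedding, data) blocks
    and answers semantic queries by returning only the blocks whose
    relevance to the query exceeds a threshold."""

    def __init__(self, relevance_threshold: float = 0.9):
        self.relevance_threshold = relevance_threshold
        self.blocks = []  # list of (embedding, data) pairs

    def store(self, embedding: np.ndarray, data: bytes) -> None:
        # Each block is tagged with a semantic embedding at write time.
        self.blocks.append((embedding, data))

    def fetch(self, query_embedding: np.ndarray) -> list:
        # Rank blocks by semantic relevance and transfer only those above
        # the threshold, rather than every block at a specified location.
        return [
            data
            for emb, data in self.blocks
            if cosine_similarity(emb, query_embedding) >= self.relevance_threshold
        ]

# Usage: store two blocks, then query with an embedding close to the first.
rng = np.random.default_rng(0)
ctrl = SemanticMemoryController(relevance_threshold=0.9)
e1, e2 = rng.normal(size=64), rng.normal(size=64)
ctrl.store(e1, b"weights shard 0")
ctrl.store(e2, b"weights shard 1")
assert ctrl.fetch(e1) == [b"weights shard 0"]  # only the relevant block moves
```

Because only blocks passing the relevance test cross the memory bus, the transferred volume shrinks in proportion to how selective the threshold is, which is the bandwidth saving the disclosure attributes to the controller.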
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
NA, "Semantic Search Based Memory Controller to Accelerate LLMs and Foundational Models", Technical Disclosure Commons, (November 15, 2024)
https://www.tdcommons.org/dpubs_series/7537