Inventor(s)

Abstract

Systems and methods are described for reducing SSD read volume during embedding table training through batch-aware, cross-iteration prediction of cache reuse. For the next training batch, a controller tests the batch indices against a probabilistic cached-index membership structure that represents indices likely to be resident in an embedding cache, based on one or more prior iterations. A differential prefetch set is formed from the indices predicted to be uncached, and an SSD prefetch is issued only for that set while batch-structure metadata such as offsets is preserved. The system may compute overlap statistics between consecutive batches using GPU-accelerated operations, track moving averages over a sliding window, and adaptively enable or disable differential prefetching based on the observed overlap and estimated overhead. Multi-step prediction may be performed by maintaining a ring buffer of K membership structures and querying them with union semantics. A false positive, in which an index is predicted to be cached but is actually absent, may lead to a demand fetch but does not affect correctness.
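The flow described in the abstract can be illustrated with a minimal CPU-only sketch. It assumes a Bloom filter as the probabilistic membership structure, one filter per prior batch, and a fixed overlap threshold standing in for the overhead estimate; the abstract does not specify any of these choices. The names `BloomFilter`, `ReusePredictor`, and `OverlapGate` are hypothetical, and the GPU-accelerated overlap computation and the offsets metadata are omitted.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, index):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.blake2b(str(index).encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, index):
        for p in self._positions(index):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, index):
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(index))


class ReusePredictor:
    """Ring buffer of K filters, queried with union semantics."""

    def __init__(self, k=2):
        self.ring = deque(maxlen=k)  # oldest filter is evicted automatically

    def record_batch(self, indices):
        # Assumption: every index touched by a batch ends up in the cache.
        bf = BloomFilter()
        for i in indices:
            bf.add(i)
        self.ring.append(bf)

    def predicted_cached(self, index):
        return any(index in bf for bf in self.ring)

    def differential_prefetch(self, indices):
        # Keep only indices predicted uncached, deduplicated, in batch order.
        seen, out = set(), []
        for i in indices:
            if i not in seen and not self.predicted_cached(i):
                seen.add(i)
                out.append(i)
        return out


class OverlapGate:
    """Enable differential prefetch when mean overlap over a window is high."""

    def __init__(self, window=8, threshold=0.2):
        self.history = deque(maxlen=window)
        self.threshold = threshold  # assumed stand-in for the overhead estimate

    def observe(self, prev_indices, next_indices):
        nxt = set(next_indices)
        overlap = len(nxt & set(prev_indices)) / len(nxt) if nxt else 0.0
        self.history.append(overlap)

    def enabled(self):
        if not self.history:
            return False
        return sum(self.history) / len(self.history) >= self.threshold
```

In this sketch, a caller would invoke `record_batch` after each iteration, ask `OverlapGate.enabled()` whether the filter lookups are worth their cost, and issue the SSD prefetch for the list returned by `differential_prefetch`. An index that the filters wrongly report as cached is simply left out of that list and would be served later by a demand fetch.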

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
