Inventor(s)

Abstract

Techniques are described for adaptive cache management in SSD-backed embedding table systems that serve both training and inference. A mode detection engine monitors access pattern features including access multiplicity, write frequency, batch sequentiality, and request concurrency, and classifies operation as training, inference, or hybrid. Based on the classification, a cache manager switches among mode-specific policies for locking, prefetch, eviction, and write-back. Training mode may use generation-based locking, DataLoader lookahead prefetch, distance-based eviction, and write-back of dirty entries on eviction. Inference mode may use reference-count locking to block eviction during active requests, request coalescing with index deduplication for prefetch, and frequency-weighted eviction without write-back. Hybrid mode may blend eviction scoring and apply conjunctive locking constraints. Runtime mode transitions may occur without restart.
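The classify-then-switch flow described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation. The names `AccessFeatures`, `Policy`, `classify`, and `CacheManager`, the policy label strings, and both thresholds are hypothetical. The abstract does not state how the features map to a mode, so the rule here (writes suggest training, high concurrency suggests inference, both together suggest hybrid) is an assumption. The hybrid prefetch and write-back settings are assumptions as well.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    TRAINING = "training"
    INFERENCE = "inference"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class AccessFeatures:
    """Access-pattern features over one monitoring window (hypothetical units)."""
    access_multiplicity: float   # mean accesses per distinct index
    write_fraction: float        # writes divided by total accesses
    batch_sequentiality: float   # 0..1, how predictable the batch order is
    request_concurrency: float   # mean number of requests in flight


@dataclass(frozen=True)
class Policy:
    locking: str
    prefetch: str
    eviction: str
    write_back: bool


# Training and inference rows follow the abstract. For hybrid, the abstract
# only specifies blended eviction and conjunctive locking; the prefetch and
# write-back values in that row are assumptions.
POLICIES = {
    Mode.TRAINING: Policy("generation", "dataloader_lookahead", "distance", True),
    Mode.INFERENCE: Policy("refcount", "coalesce_dedup", "frequency_weighted", False),
    Mode.HYBRID: Policy("generation_and_refcount", "lookahead_and_coalesce", "blended", True),
}


def classify(f: AccessFeatures,
             write_min: float = 0.05,
             concurrency_min: float = 4.0) -> Mode:
    """Assumed rule: thresholds are illustrative, not from the source.

    Only write_fraction and request_concurrency are used here; the other
    two features are carried but left unused in this sketch.
    """
    has_writes = f.write_fraction >= write_min
    is_concurrent = f.request_concurrency >= concurrency_min
    if has_writes and is_concurrent:
        return Mode.HYBRID
    if has_writes:
        return Mode.TRAINING
    return Mode.INFERENCE


class CacheManager:
    """Swaps the active policy in place, so a mode change needs no restart."""

    def __init__(self) -> None:
        self.mode = Mode.INFERENCE
        self.policy = POLICIES[self.mode]

    def observe(self, features: AccessFeatures) -> Mode:
        new_mode = classify(features)
        if new_mode is not self.mode:
            self.mode = new_mode
            self.policy = POLICIES[new_mode]
        return self.mode
```

A caller would feed `CacheManager.observe` one `AccessFeatures` sample per monitoring window and read `policy` before each locking, prefetch, or eviction decision. A real system would likely add hysteresis so that a single noisy window does not flip the mode.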

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
