Inventor(s)

Abstract

Techniques are described for configuring a multi-tier cache hierarchy for SSD-backed embedding stores based on measured or predicted Zipf access skew. Live embedding index accesses are monitored to estimate a Zipf exponent using a sliding window and a Hill estimator, and temporal reuse may be estimated from inter-batch overlap. Cache sizes for multiple tiers are derived for target hit rates using closed-form Zipf working-set relationships, with separate handling for exponents greater than one and approximations for exponents near one. Cache sizes may be dynamically reconfigured at runtime, without restarting processing, using smoothed exponent estimates and a deviation threshold; entries evicted from an upper cache may be demoted to a lower cache. Offline logs may be analyzed to predict tier hit rates, SSD IOPS, latency, and throughput, and to recommend cache and hardware configurations.
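As a rough illustration of the first two steps named in the abstract, the following Python sketch estimates a Zipf exponent from one window of accessed embedding indices with a Hill estimator, then sizes a cache for a target hit rate. It is a minimal sketch under stated assumptions, not the disclosed implementation: the function names `hill_zipf_exponent` and `cache_entries_for_hit_rate`, the tail-size parameter `k`, and the use of a single fixed window are all hypothetical. The sizing step sums Zipf weights numerically as a stand-in for the closed-form working-set relationships, whose exact form is not given here.

```python
import math
from collections import Counter


def hill_zipf_exponent(accesses, k):
    """Hill estimate of the Zipf exponent from one window of accessed indices.

    Assumption: the k most frequently accessed indices follow a power law,
    so the mean log-ratio of their counts to the (k+1)-th largest count
    approximates the exponent.
    """
    counts = sorted(Counter(accesses).values(), reverse=True)
    if k < 1 or k >= len(counts):
        raise ValueError("k must satisfy 1 <= k < number of distinct indices")
    threshold = counts[k]  # the (k+1)-th largest access count
    return sum(math.log(c / threshold) for c in counts[:k]) / k


def cache_entries_for_hit_rate(s, n_items, target_hit_rate):
    """Smallest number of top-ranked entries whose Zipf mass reaches the target.

    Assumption: accesses are independent draws from a Zipf law with exponent s
    over n_items entries, and the cache holds the most popular entries.
    """
    weights = [rank ** -s for rank in range(1, n_items + 1)]
    total = sum(weights)
    running = 0.0
    for entries, weight in enumerate(weights, start=1):
        running += weight
        if running >= target_hit_rate * total:
            return entries
    return n_items  # target not reachable below full capacity


if __name__ == "__main__":
    # Synthetic window: index r is accessed roughly 10000 / r times.
    window = [r for r in range(1, 501) for _ in range(max(1, round(10000 / r)))]
    s_hat = hill_zipf_exponent(window, k=200)
    print("estimated exponent:", round(s_hat, 3))
    print("entries for 50% hit rate:", cache_entries_for_hit_rate(s_hat, 500, 0.5))
```

In a runtime setting, the estimate would presumably be recomputed per sliding window and smoothed before any resize is triggered; the sketch omits smoothing, the deviation threshold, and multi-tier demotion.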

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
