Abstract

Techniques are described for speculative data prefetching in systems in which an LLM agent dynamically invokes retrieval backends whose candidates require embedding lookups from SSD-backed embedding tables. A user memory state with tiered memory data is obtained and converted into features. An intent prediction model produces, for each backend, a probability that the agent will invoke that backend. A prefetch controller computes a backend-specific prefetch decision using a value metric that combines the probability, expected embedding lookups, SSD latency, and prefetch cost, and selects backends subject to a bandwidth budget based on SSD throughput and a Phase 1 time window. Speculative SSD reads are issued asynchronously during Phase 1 to populate a cache before tool invocation in Phase 2. Mid-flight reasoning signals may update the probabilities, enabling cancellation or late initiation of a prefetch. Observed invocations may be logged for online refinement of the intent model.
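The selection step can be illustrated with a minimal sketch. The abstract names the inputs to the value metric but not its form, so everything specific here is an assumption: the metric is taken to be expected latency saved minus a per-byte read cost, the budget is taken to be SSD throughput multiplied by the Phase 1 window, and backends are chosen greedily by value per byte. The names `Backend`, `prefetch_value`, `select_backends`, `bytes_per_lookup`, and `cost_per_byte` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    p_invoke: float        # predicted probability the agent invokes this backend
    expected_lookups: int  # expected embedding lookups if it is invoked
    bytes_per_lookup: int  # assumed SSD bytes read per lookup (hypothetical)


def prefetch_value(b: Backend, ssd_latency_s: float, cost_per_byte: float) -> float:
    """Assumed metric: expected latency saved minus the cost of the speculative read."""
    expected_saving = b.p_invoke * b.expected_lookups * ssd_latency_s
    read_cost = b.expected_lookups * b.bytes_per_lookup * cost_per_byte
    return expected_saving - read_cost


def select_backends(backends, ssd_throughput_bps, phase1_window_s,
                    ssd_latency_s, cost_per_byte):
    """Greedily pick positive-value backends that fit the Phase 1 byte budget."""
    # Assumed budget: bytes the SSD can deliver during the Phase 1 window.
    remaining = ssd_throughput_bps * phase1_window_s
    scored = []
    for b in backends:
        value = prefetch_value(b, ssd_latency_s, cost_per_byte)
        size = b.expected_lookups * b.bytes_per_lookup
        if value > 0 and size > 0:
            scored.append((value / size, size, b))
    # Highest value per byte first (a common knapsack heuristic, not
    # necessarily the selection rule used in the disclosure).
    scored.sort(key=lambda item: item[0], reverse=True)
    chosen = []
    for _, size, b in scored:
        if size <= remaining:
            chosen.append(b.name)
            remaining -= size
    return chosen


backends = [
    Backend("docs", 0.9, 1000, 512),
    Backend("code", 0.2, 4000, 512),
    Backend("web", 0.6, 500, 512),
]
print(select_backends(backends, ssd_throughput_bps=10_000_000,
                      phase1_window_s=0.1, ssd_latency_s=100e-6,
                      cost_per_byte=1e-8))
# -> ['docs', 'web']
```

With a 1 MB budget, `docs` and `web` fit and are selected, while `code` has positive value but would need about 2 MB of reads, so it is skipped. A mid-flight probability update could be handled by re-running the selection with the new `p_invoke` values and cancelling any in-flight prefetch whose backend no longer appears in the result.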

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Anonymous, "Speculative Data Prefetching Based on Predicted Agent Intent in Machine Learning Systems", Technical Disclosure Commons, (June 30, 2026)
https://www.tdcommons.org/dpubs_series/10773

Technical Disclosure Commons

Defensive Publications Series

Speculative Data Prefetching Based on Predicted Agent Intent in Machine Learning Systems
