Abstract

Systems and methods are described for caching large language model (LLM) reasoning outputs in recommendation services using semantic profile versioning. Each cache entry maps a key composed of a user identifier, a user profile version, and a candidate identifier to a reasoning output that includes a score, a tier, and a rationale. In one embodiment, the profile version is incremented when a profile update delta is classified as a semantic change: a preference addition or removal, a profile regeneration, or, optionally, a confidence update that crosses a configured threshold. Session and timestamp updates are treated as non-semantic. A two-level cache, consisting of a session cache and a cross-session cache, may be used. Upon a semantic change, partial invalidation is performed by using topic affinity to identify the affected candidates; entries for affected candidates are invalidated, while entries for unaffected candidates are migrated to the new profile version to preserve reuse across sessions.
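The mechanism summarized above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: every name here (`VersionedCache`, `Delta`, `is_semantic`, the delta kind strings, the 0.5 threshold) is hypothetical. Topic affinity is approximated by plain set overlap between a candidate's topics and the topics touched by the delta, a profile regeneration is assumed to affect every candidate, and the two-level session/cross-session arrangement is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical delta kinds; the disclosure does not name them.
SEMANTIC_KINDS = {"preference_added", "preference_removed", "profile_regenerated"}


@dataclass(frozen=True)
class ReasoningOutput:
    score: float
    tier: str
    rationale: str


@dataclass
class Delta:
    kind: str
    topics: frozenset = frozenset()   # topics touched by the update
    old_confidence: float = 0.0
    new_confidence: float = 0.0


def is_semantic(delta, threshold=0.5):
    """Classify a profile update delta as semantic or non-semantic."""
    if delta.kind in SEMANTIC_KINDS:
        return True
    if delta.kind == "confidence_update":
        # Semantic only when the confidence crosses the configured threshold.
        return (delta.old_confidence < threshold) != (delta.new_confidence < threshold)
    return False  # session updates, timestamp updates, anything else


class VersionedCache:
    def __init__(self):
        self.entries = {}    # (user_id, profile_version, candidate_id) -> ReasoningOutput
        self.versions = {}   # user_id -> current profile version

    def _key(self, user_id, candidate_id):
        return (user_id, self.versions.get(user_id, 0), candidate_id)

    def get(self, user_id, candidate_id):
        return self.entries.get(self._key(user_id, candidate_id))

    def put(self, user_id, candidate_id, output):
        self.entries[self._key(user_id, candidate_id)] = output

    def apply_delta(self, user_id, delta, candidate_topics):
        """Bump the version on a semantic change and partially invalidate."""
        if not is_semantic(delta):
            return
        old = self.versions.get(user_id, 0)
        new = old + 1
        self.versions[user_id] = new
        old_keys = [k for k in self.entries if k[0] == user_id and k[1] == old]
        for key in old_keys:
            output = self.entries.pop(key)
            candidate_id = key[2]
            # Assumption: affinity is set overlap; regeneration affects everything.
            overlap = candidate_topics.get(candidate_id, frozenset()) & delta.topics
            affected = delta.kind == "profile_regenerated" or bool(overlap)
            if not affected:
                # Migrate the unaffected entry to the new profile version.
                self.entries[(user_id, new, candidate_id)] = output
```

In this sketch, removing a "jazz" preference drops the cached output for a candidate tagged with "jazz" and re-keys the output for a candidate tagged only with "hiking" under the new version, so a later lookup for the second candidate still hits the cache without another LLM call.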

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Anonymous, "Semantic Profile Version-Based Cache Invalidation for Machine Learning Inference Systems", Technical Disclosure Commons, (June 30, 2026)
https://www.tdcommons.org/dpubs_series/10718

Technical Disclosure Commons

Defensive Publications Series

Semantic Profile Version-Based Cache Invalidation for Machine Learning Inference Systems
