Inventor(s)

Abstract

Techniques are described for managing write amplification in recommendation systems that use large language models (LLMs) to semantically extract preference insights from user interactions and to propagate insights to similar users. A propagation controller applies hierarchical quotas including interaction-level confidence gating, per-user limits, and a system-wide budget. Neighbor selection uses graph-topology-aware fan-out reduction based on user out-degree in a similarity graph, with neighbors ranked by similarity and top-K selected. Work is scheduled into a two-tier priority queue in which fresh interaction processing is handled with strict priority over propagated updates. Low-priority propagation tasks may be temporally batched for coalescing. A graceful degradation policy throttles, drops, or pauses propagation responsive to queue depth and/or budget state while preserving processing of high-priority work.
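
The control flow summarized above can be illustrated with a minimal Python sketch. Every name and numeric threshold here (`PropagationController`, `hub_degree`, `max_low_depth`, the fan-out formula, and so on) is a hypothetical choice made for illustration and is not taken from the disclosure. The sketch covers the three quota levels, degree-aware top-K neighbor selection, strict two-tier priority, and a drop-only form of degradation; temporal batching, coalescing, throttling, and pausing are omitted.

```python
import heapq
import itertools
from collections import defaultdict
from dataclasses import dataclass, field

HIGH, LOW = 0, 1  # lower value pops first, so HIGH always precedes LOW


@dataclass
class PropagationController:
    # All defaults are illustrative assumptions, not values from the disclosure.
    min_confidence: float = 0.7  # interaction-level confidence gate
    per_user_limit: int = 5      # propagated writes allowed per source user
    system_budget: int = 100     # system-wide cap on propagated writes
    base_k: int = 8              # top-K before degree-based reduction
    hub_degree: int = 100        # out-degree step at which K shrinks
    max_low_depth: int = 50      # pending LOW tasks beyond which new ones are dropped
    _per_user: dict = field(default_factory=lambda: defaultdict(int))
    _queue: list = field(default_factory=list)
    _seq: itertools.count = field(default_factory=itertools.count)
    _spent: int = 0
    _low_depth: int = 0

    def fan_out(self, out_degree: int) -> int:
        # Hypothetical reduction rule: high out-degree users get a smaller K.
        return max(1, self.base_k // (1 + out_degree // self.hub_degree))

    def submit(self, user, insight, confidence, neighbors):
        """Queue fresh work, then propagate to top-K neighbors within quotas.

        `neighbors` maps neighbor id -> similarity score.
        Returns the number of propagation tasks actually queued.
        """
        # Fresh interaction processing is always accepted at HIGH priority.
        heapq.heappush(self._queue, (HIGH, next(self._seq), ("extract", user, insight)))
        if confidence < self.min_confidence:
            return 0  # confidence gate: low-confidence insights never propagate
        k = self.fan_out(len(neighbors))
        ranked = sorted(neighbors.items(), key=lambda kv: kv[1], reverse=True)[:k]
        queued = 0
        for neighbor, _similarity in ranked:
            if (self._per_user[user] >= self.per_user_limit
                    or self._spent >= self.system_budget
                    or self._low_depth >= self.max_low_depth):
                break  # degradation: drop the remaining propagation work
            heapq.heappush(self._queue, (LOW, next(self._seq), ("propagate", neighbor, insight)))
            self._per_user[user] += 1
            self._spent += 1
            self._low_depth += 1
            queued += 1
        return queued

    def pop(self):
        """Return (priority, task) for the next task in strict priority order."""
        priority, _, task = heapq.heappop(self._queue)
        if priority == LOW:
            self._low_depth -= 1
        return priority, task


if __name__ == "__main__":
    ctl = PropagationController(per_user_limit=2)
    ctl.submit("u1", "likes jazz", 0.9, {"a": 0.9, "b": 0.8, "c": 0.5})
    ctl.submit("u2", "likes rock", 0.95, {"a": 0.7})
    while ctl._queue:
        print(ctl.pop())
```

The monotonically increasing sequence number in each heap entry keeps ordering first-in, first-out within a tier and prevents the heap from ever comparing task payloads. Rejecting propagation at submission time, rather than at dequeue time, is one simple way to keep the high-priority path unaffected when quotas or queue depth are exhausted.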

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
