Abstract
A unified lifecycle system supports dual-vocabulary LLM-enhanced recommendations that use both natural language tokens and semantic identifier (SID) tokens. Cross-architecture migration preserves accumulated SID knowledge by learning an embedding projection that minimizes pairwise similarity distortion and enforces neighborhood contraction for K-nearest SID neighbors, followed by staged initialization, embedding warmup, continued pretraining with synthetic domain replay, and validation. Modality-aware distillation applies dual temperatures, a higher temperature for natural language tokens and a lower temperature for SID tokens to avoid catastrophic SID substitutions, together with long-tail SID importance weighting, SID embedding alignment, and a closed-loop controller that adjusts temperatures and loss weights when SID quality degrades. Serving uses staleness-tiered amortization: offline item SID embeddings, near-line user embeddings, O(1) real-time adaptation, and a lightweight online fusion network. A cross-tier consistency protocol decays real-time state upon each near-line refresh to prevent double-counting. The tiered design targets low serving latency and reduced memory use.
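The dual-temperature idea in the abstract can be illustrated with a minimal sketch. It is not the disclosure's implementation: the function names, the temperature values (4.0 for natural language, 1.0 for SIDs), the single `sid_weight` factor standing in for long-tail importance weighting, and the conventional temperature-squared scaling of the KL term are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def dual_temperature_kd_loss(teacher_logits, student_logits, is_sid,
                             t_nl=4.0, t_sid=1.0, sid_weight=1.0):
    """Average distillation loss over positions, with a per-modality temperature.

    teacher_logits, student_logits: one list of logits per target position.
    is_sid: one boolean per position, True when the target is a SID token.
    t_nl, t_sid, sid_weight: hypothetical values, not taken from the disclosure.
    """
    total = 0.0
    for t_row, s_row, sid in zip(teacher_logits, student_logits, is_sid):
        temperature = t_sid if sid else t_nl
        weight = sid_weight if sid else 1.0
        p = softmax(t_row, temperature)  # teacher distribution
        q = softmax(s_row, temperature)  # student distribution
        # Temperature-squared scaling is the usual distillation convention (assumed here).
        total += weight * temperature ** 2 * kl_divergence(p, q)
    return total / len(teacher_logits)

# Two positions: the first is a natural language token, the second a SID token.
teacher = [[2.0, 1.0, 0.1], [5.0, 0.5, 0.2]]
student = [[1.5, 1.2, 0.3], [3.0, 1.0, 0.5]]
loss = dual_temperature_kd_loss(teacher, student, is_sid=[False, True])
```

With a low SID temperature the teacher distribution stays sharply peaked on its preferred identifier, so a student that shifts probability to a neighboring SID is penalized more heavily than it would be under the softened natural language targets. A closed-loop controller of the kind the abstract mentions could adjust `t_sid` and `sid_weight` between training steps.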
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Anonymous, "Dual-Vocabulary LLM Recommendations", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/10645