Abstract
Existing content recommendation algorithms often make little effective use of structured information when deciding what to recommend. This can limit their ability to discover similar content or to introduce sufficient diversity into recommendations. In addition, or in the alternative, current techniques for enhancing content recommendations may lack personalization, diversity, and transparency. These techniques may also present static sets of content items (e.g., content clusters) that do not adapt to real-time events, and the labels of those content clusters may not clearly describe the underlying theme of the content items they contain.
This disclosure describes technology for generating content recommendations using a graph-based structure. A content-item-to-content-item graph is created in which content items are represented as nodes. Edges between the nodes are established and weighted based on the similarity of Large Language Model (LLM) embeddings derived from textual summaries of the content items that the nodes represent. These edge weights can be further modified using data from other embedding sources and personalized based on data about user interactions with the content items. Recommendations are generated by sampling nodes that neighbor content items a user has previously engaged with, using the edge weights as a probability distribution over the user's interest in each content item.
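The graph construction and sampling described above can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the disclosed implementation: the function names (build_graph, personalize, recommend), the use of cosine similarity with a positive threshold, the multiplicative engagement boost, and the choice to sum edge weights across a user's history are all hypothetical. Embeddings are assumed to be precomputed from LLM-generated summaries and passed in as plain lists of floats.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_graph(embeddings, threshold=0.0):
    """Build an undirected weighted graph: item -> {neighbor: weight}.

    An edge is kept only if similarity exceeds `threshold` (assumed >= 0),
    so every stored weight is strictly positive.
    """
    graph = {item: {} for item in embeddings}
    items = list(embeddings)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            w = cosine(embeddings[a], embeddings[b])
            if w > threshold:
                graph[a][b] = w
                graph[b][a] = w
    return graph


def personalize(graph, engagement, boost=1.0):
    """Hypothetical personalization: scale each edge by the target's engagement.

    `engagement` maps item -> non-negative score from user interaction data.
    """
    return {
        src: {dst: w * (1.0 + boost * engagement.get(dst, 0.0))
              for dst, w in nbrs.items()}
        for src, nbrs in graph.items()
    }


def recommend(graph, history, k, rng=None):
    """Sample up to k unseen neighbors of the user's history without replacement.

    Edge weights, summed over all history items, act as unnormalized
    sampling probabilities.
    """
    rng = rng or random.Random()
    seen = set(history)
    scores = {}
    for item in history:
        for nbr, w in graph.get(item, {}).items():
            if nbr not in seen:
                scores[nbr] = scores.get(nbr, 0.0) + w
    picks = []
    while scores and len(picks) < k:
        candidates = list(scores)
        weights = [scores[c] for c in candidates]
        choice = rng.choices(candidates, weights=weights, k=1)[0]
        picks.append(choice)
        del scores[choice]
    return picks
```

Because candidates are sampled rather than ranked, lower-weight neighbors are still occasionally selected, which is one simple way to obtain the diversity the disclosure mentions. A real system would likely replace the quadratic pairwise comparison in build_graph with an approximate nearest-neighbor index.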
The disclosed technology provides a structured approach to content recommendation and discovery. The disclosed technology may facilitate personalizing content recommendations based on user interactions with past content items and the similarity between those past content items and other content items. In addition, or in the alternative, the disclosed technology may enhance content discovery for a user, thereby increasing the diversity of the content recommendations presented to the user.
This disclosure also describes technology for enhancing content recommendations using Large Language Models (LLMs) to analyze user data, such as the viewing history of a user and the interactions of the user with content items, to generate personalized affinity profiles. The disclosed technology uses an LLM to analyze content cluster composition to identify biases, to generate natural language explanations for content recommendations, and to create more descriptive content cluster labels. Additionally, the LLM may analyze trending topics to suggest adjustments to content cluster composition in real time. The disclosed technology may improve the accuracy, diversity, fairness, and relevance of content recommendations, while also enhancing user trust and transparency through clearer explanations and content cluster descriptions.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Chatterjee, Tamojit and Nayak, Shravan, "Large Language Model Embedding Graph for Content Recommendation and Discovery", Technical Disclosure Commons, (September 29, 2025)
https://www.tdcommons.org/dpubs_series/8651