Abstract
Some conversational agents are largely reactive and may have limited ability to build a detailed, evolving understanding of user preferences over time. This disclosure describes a proactive agent framework that can use hierarchical reinforcement learning to construct a longitudinal user model. A high-level policy, which may utilize a deep Q-network, can determine when to query the user and what to ask, balancing expected information gain against potential user annoyance. A low-level policy may then generate the specific natural language query. User responses can be encoded and clustered to form a dynamic knowledge graph of interests, and a graph neural network may operate on this graph to predict future user needs. This approach can allow an agent to move beyond reactive command execution to proactively anticipate user needs and offer contextually relevant assistance, for instance, by managing an exploration-exploitation trade-off during information gathering.
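
The sketch below illustrates one way the high-level decision step described above could look in practice: a small deep Q-network scores "ask about topic t" actions against a "stay silent" action over an encoded user state, with epsilon-greedy selection standing in for the exploration-exploitation trade-off. All names, dimensions, and the reward shaping (information gain minus an annoyance penalty) are illustrative assumptions, not details taken from the disclosure.

    # Minimal sketch of a high-level DQN policy deciding when/what to ask.
    # All identifiers, shapes, and rewards are hypothetical placeholders.
    import random
    import torch
    import torch.nn as nn

    NUM_TOPICS = 8                   # assumed number of interest clusters
    STATE_DIM = 32                   # assumed size of the encoded user state
    NUM_ACTIONS = NUM_TOPICS + 1     # one "ask" action per topic + "stay silent"

    # High-level policy: a small deep Q-network mapping state -> action values.
    q_net = nn.Sequential(
        nn.Linear(STATE_DIM, 64),
        nn.ReLU(),
        nn.Linear(64, NUM_ACTIONS),
    )

    def select_action(state: torch.Tensor, epsilon: float) -> int:
        """Epsilon-greedy selection: explore a random action with probability
        epsilon, otherwise exploit the highest-value action. Action index
        NUM_TOPICS denotes "stay silent"."""
        if random.random() < epsilon:
            return random.randrange(NUM_ACTIONS)
        with torch.no_grad():
            return int(q_net(state).argmax().item())

    def reward(info_gain: float, annoyed: bool, annoyance_cost: float = 0.5) -> float:
        """Illustrative reward: information gained from the user's answer,
        minus a penalty if the question annoyed the user (e.g., was dismissed)."""
        return info_gain - (annoyance_cost if annoyed else 0.0)

    # One decision step: any action below NUM_TOPICS would be handed to the
    # low-level policy to phrase as a natural language query.
    state = torch.randn(STATE_DIM)   # stand-in for the real user encoding
    action = select_action(state, epsilon=0.1)
    print("stay silent" if action == NUM_TOPICS else f"ask about topic {action}")

Epsilon-greedy is only the simplest possible exploration strategy; the disclosure leaves the exact mechanism for trading information gathering against user annoyance open, so the selection rule and reward above should be read as placeholders rather than the described method.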
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Shin, D., "Hierarchical Reinforcement Learning for Proactive and Longitudinal User Modeling in a Conversational Agent", Technical Disclosure Commons, (November 06, 2025)
https://www.tdcommons.org/dpubs_series/8842