Abstract
Dynamic evaluation of an app requires running it on a mobile device, after logging in with real credentials, and assessing its performance in a changing environment. This disclosure describes techniques that dynamically evaluate apps by exploring an app to reach the right state, dynamically generating relevant user commands for that state, and then evaluating the app in that state. The techniques use three primary agents: a crawling agent, an agent-under-test, and an automatic rating agent. The agents are managed by a central orchestrator and leverage a large language model (LLM). The techniques are applicable in situations where critical user journeys (CUJs) are dynamically generated from an input state such as a screenshot, and where the tasks resulting from a CUJ must be evaluated. The techniques transform an abstract CUJ into a well-defined, concrete CUJ in the context of that state.
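The orchestrator-and-three-agents flow summarized above could be sketched roughly as follows. This is a minimal illustrative sketch, not the disclosed implementation: the class names, the `llm` stand-in function, the screenshot-as-string state representation, and the single-episode loop are all assumptions made for clarity.

```python
from dataclasses import dataclass

@dataclass
class AppState:
    """A snapshot of the app, e.g., derived from a screenshot (assumed here to be text)."""
    screen: str

def llm(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    return f"response to: {prompt}"

class CrawlingAgent:
    """Explores the app to reach the state targeted by an abstract CUJ."""
    def explore(self, goal: str) -> AppState:
        return AppState(screen=llm(f"navigate to state for: {goal}"))

class AgentUnderTest:
    """Executes the concrete CUJ (generated user commands) at that state."""
    def execute(self, state: AppState, cuj: str) -> str:
        return llm(f"perform '{cuj}' on screen '{state.screen}'")

class RatingAgent:
    """Automatically scores the outcome of the executed task."""
    def rate(self, outcome: str) -> float:
        return 1.0 if outcome else 0.0

class Orchestrator:
    """Central orchestrator coordinating the three agents for one evaluation episode."""
    def __init__(self) -> None:
        self.crawler = CrawlingAgent()
        self.subject = AgentUnderTest()
        self.rater = RatingAgent()

    def evaluate(self, abstract_cuj: str) -> float:
        state = self.crawler.explore(abstract_cuj)           # reach the right state
        concrete_cuj = llm(f"ground '{abstract_cuj}' in '{state.screen}'")
        outcome = self.subject.execute(state, concrete_cuj)  # run the task
        return self.rater.rate(outcome)                      # score the result

score = Orchestrator().evaluate("add an item to the shopping cart")
```

In this sketch, grounding the abstract CUJ in the crawled state is what produces the concrete CUJ that the agent-under-test executes and the rating agent scores.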
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Tyamagundlu, Divya and Gupta, Pramod, "Evaluating Agentic Behaviors in Dynamic Environments", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/9770