Inventor(s)

Abstract

A validation layer for agentic recommendation systems validates psychological reasoning used by an LLM-based recommendation agent. A non-LLM behavioral ground truth estimator computes psychological-state estimates from behavioral signals with confidence values. A non-LLM reasoning extractor parses agent reasoning traces into structured psychological-state claims. Claims are compared to ground truth subject to confidence thresholds to compute per-state reasoning-quality metrics including precision, recall, and hallucination rate, optionally at granularities such as per topic, per user segment, and per prompt version. A hallucination detection pipeline may apply tiered responses including logging, soft override via ground truth injection, and hard override for safety-relevant actions. Self-fulfilling hallucination loops are detected using pre-intervention or counterfactual baselines rather than post-decision signals. A prompt regression framework evaluates prompt versions on a golden set with per-state metrics, and controlled ablations isolate the causal contribution of psychological reasoning to ranking quality.

Creative Commons License

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Share

COinS