Abstract
The present disclosure relates to a system for explaining fraud scores produced by deep learning models. The system comprises a deep learning model configured to process a sequence of raw transaction attributes and generate a fraud score. A concept definition framework provides a programmatically defined label set that captures semantic concepts associated with transactional behaviours. A concept activation vector (CAV) is established for each defined semantic concept from historical transaction data. An integrated conceptual sensitivity (ICS) mechanism aggregates attributions from the deep learning model onto the defined semantic concepts, yielding an explanation of the generated fraud score. Additionally, a hierarchical explanation interface visualizes the contribution of each semantic concept to the fraud score and links each concept to the individual attributes within the raw transaction data that support it.
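The abstract does not define how CAVs or the ICS mechanism are computed, so the following Python sketch is only one plausible reading, not the disclosed method. It assumes a mean-difference CAV (a simplification of the usual linear-classifier approach), a toy logistic score head standing in for the top of the deep learning model, and an ICS value obtained by averaging the score gradient along a straight path from a baseline activation and crediting the concept with the part of the score change that lies along its CAV. All function names, parameters, and the score head are hypothetical.

```python
import numpy as np


def concept_activation_vector(concept_acts, other_acts):
    """Unit vector from non-concept to concept activations.

    Assumption: a mean-difference CAV; a linear classifier's normal
    vector is the more common choice.
    """
    direction = concept_acts.mean(axis=0) - other_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)


def score_head(h, w, b):
    """Hypothetical fraud-score head: sigmoid of a linear readout."""
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))


def score_gradient(h, w, b):
    """Analytic gradient of the sigmoid score with respect to h."""
    s = score_head(h, w, b)
    return s * (1.0 - s) * w


def integrated_concept_sensitivity(h, baseline, cav, w, b, steps=50):
    """Score change attributable to movement along the concept direction.

    Averages the score gradient along the straight path from `baseline`
    to `h` (midpoint rule), then multiplies the gradient component along
    the CAV by the displacement component along the CAV.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack(
        [score_gradient(baseline + a * (h - baseline), w, b) for a in alphas]
    )
    avg_grad = grads.mean(axis=0)
    displacement_along_cav = (h - baseline) @ cav
    return float(displacement_along_cav * (avg_grad @ cav))
```

Under these assumptions, a concept whose CAV is aligned with the direction the score head reads from receives nearly all of the score change between the baseline and the transaction, while a concept orthogonal to that direction receives none.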
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
SUTTON, DAVID; ANDERSON, MARK; SKALSKI, PIOTR; and ROZANOVA, YULIA, "SYSTEM AND METHOD FOR EXPLAINABLE DEEP LEARNING FRAUD DETECTION SCORES", Technical Disclosure Commons, (October 29, 2025)
https://www.tdcommons.org/dpubs_series/8806