Abstract

This disclosure describes a method for monitoring feature drift and deciding when a deployed predictive model needs updating. A classifier-based drift scoring model is first built from historical data. SHAP (SHapley Additive exPlanations) values derived from this model quantify how much each feature contributes to the drift score. New data is then grouped into clusters according to the similarity of their drift patterns, as expressed by the SHAP values. The performance degradation of the deployed predictive model is estimated separately under each drift pattern, showing how drift in different features affects the model. The decision to update the deployed model considers both population-level and cluster-level degradation estimates, so that it reflects the global impact as well as the effects within specific drift-pattern clusters. Finally, a pipeline is proposed for continually updating the drift monitoring system itself, so that the measured drift patterns and degradation estimates stay aligned with the evolving data and continue to support accurate update decisions.
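
The dual decision step can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `ClusterEstimate` structure, the `should_update` function, the threshold values, the cluster names and figures, and the choice to compute the population-level estimate as a share-weighted average of the cluster estimates. The disclosure does not specify any of these.

```python
from dataclasses import dataclass


@dataclass
class ClusterEstimate:
    """Estimated degradation for one drift-pattern cluster (hypothetical structure)."""
    name: str
    share: float        # fraction of new data assigned to this cluster
    degradation: float  # estimated drop in the deployed model's metric


def should_update(clusters, pop_threshold=0.05, cluster_threshold=0.15, min_share=0.10):
    """Return (decision, population_degradation, flagged_cluster_names).

    All thresholds are illustrative assumptions, not values from the disclosure.
    """
    # Population-level estimate: here assumed to be the share-weighted
    # average of the per-cluster degradation estimates.
    total_share = sum(c.share for c in clusters)
    population = sum(c.share * c.degradation for c in clusters) / total_share

    # Cluster-level check: flag clusters that are both sizeable and
    # severely degraded, even if the population average looks acceptable.
    flagged = [
        c.name
        for c in clusters
        if c.share >= min_share and c.degradation > cluster_threshold
    ]

    return population > pop_threshold or bool(flagged), population, flagged


# Made-up example figures: one drift pattern hurts the model badly,
# but the population-level average stays under its threshold.
clusters = [
    ClusterEstimate("no_drift", share=0.70, degradation=0.01),
    ClusterEstimate("sensor_shift", share=0.20, degradation=0.20),
    ClusterEstimate("rare_pattern", share=0.10, degradation=0.02),
]
decision, population, flagged = should_update(clusters)
```

In this made-up example the population-level estimate alone would not trigger an update, while the cluster-level check flags one drift pattern. This is the kind of case the combined assessment is meant to catch.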

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.