Abstract

The present disclosure provides a method and a system for scalable query sampling from large and imbalanced data. The method comprises determining feature families in the data, where each family consists of correlated features. The method further comprises determining importance scores for the features in order to retain those that contribute significantly to the analysis of the agent system. Further, the method comprises determining deviation effect scores for all the features of each query and aggregating those scores per query to determine a selection value. The sample of queries is determined based on the selection value.
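
The steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify any formulas, so everything concrete here is an assumption made for illustration only: families are formed by thresholding absolute Pearson correlation, importance is approximated by the absolute correlation of a feature with a hypothetical outcome vector `y`, one representative feature is kept per family, the deviation effect score is an importance-weighted absolute z-score, and the sample is the `k` queries with the largest selection value. The function names and parameters are hypothetical and are not taken from the disclosure.

```python
import numpy as np


def feature_families(X, corr_threshold=0.8):
    """Group columns of X whose absolute Pearson correlation is at least the threshold."""
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    parent = list(range(n_features))  # union-find forest over feature indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n_features):
        for j in range(i + 1, n_features):
            if corr[i, j] >= corr_threshold:
                parent[find(j)] = find(i)

    families = {}
    for i in range(n_features):
        families.setdefault(find(i), []).append(i)
    return list(families.values())


def importance_scores(X, y):
    """Assumed importance proxy: absolute correlation of each feature with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)


def select_queries(X, y, k, corr_threshold=0.8, min_importance=0.1):
    """Return indices of the k queries (rows) with the largest selection value."""
    importance = importance_scores(X, y)

    # Assumption: keep the most important feature of each family, if important enough.
    kept = []
    for family in feature_families(X, corr_threshold):
        best = max(family, key=lambda f: importance[f])
        if importance[best] >= min_importance:
            kept.append(best)
    kept = np.array(kept, dtype=int)

    # Assumed deviation effect score: importance-weighted absolute z-score.
    std = X.std(axis=0)
    z = np.abs(X - X.mean(axis=0)) / np.where(std == 0, 1.0, std)
    deviation_effect = z[:, kept] * importance[kept]

    # Aggregate per query to obtain the selection value, then take the top k.
    selection_value = deviation_effect.sum(axis=1)
    chosen = np.argsort(-selection_value, kind="stable")[:k]
    return chosen, selection_value
```

Under these assumptions, calling `select_queries(X, y, k)` on a matrix with one row per query favors queries whose retained features lie far from the bulk of the data, which is one plausible way to surface rare cases in an imbalanced set. A weighted random draw proportional to the selection value would be an equally reasonable reading of the final step.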

Publication Date

2026-01-07

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
