The quality of data is typically measured on a subset of all available data, which raises the question of whether such a measurement is representative of the entire corpus. This disclosure describes techniques that use the historical data and metadata of a given time series to determine the set of useful data quality checks that could exist for it. That set is then compared with the set of data quality checks actually in place, yielding the percentage of data quality coverage for a given data set.
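As a rough illustration of the comparison described above, the sketch below computes a coverage percentage from two sets of checks. The abstract does not specify how checks are represented or how the percentage is defined, so everything here is an assumption: the function name `coverage_percent`, the representation of a check as a `(series, check_type)` pair, the choice to count only actual checks that also appear in the useful set, and the convention of returning 100 when no useful checks exist.

```python
from typing import Hashable, Iterable


def coverage_percent(
    useful_checks: Iterable[Hashable],
    actual_checks: Iterable[Hashable],
) -> float:
    """Return the share of useful checks that are actually in place, in percent.

    Assumption: coverage = |useful AND actual| / |useful| * 100. Actual checks
    that are not in the useful set are ignored rather than rewarded.
    """
    useful = set(useful_checks)
    actual = set(actual_checks)
    if not useful:
        # Assumption: with nothing useful to check, treat the data as fully covered.
        return 100.0
    return 100.0 * len(useful & actual) / len(useful)


# Hypothetical example: each check is a (time series name, check type) pair.
useful = {
    ("latency", "freshness"),
    ("latency", "range"),
    ("errors", "freshness"),
    ("errors", "range"),
}
actual = {
    ("latency", "freshness"),
    ("errors", "range"),
    ("errors", "schema"),  # in place, but not in the useful set
}
result = coverage_percent(useful, actual)
```

In this hypothetical case two of the four useful checks are in place, so the coverage is 50 percent. How the useful set itself is derived from historical data and metadata is not shown here.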
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Lees, W. Max; Liu, Yang; Lee, Steven; Li, Mingyang; He, Keyu; Wu, Eric; Cunningham, Emmett; Cruz, David Rissato; and Ezete, Chioma, "Data Quality Coverage", Technical Disclosure Commons, (September 13, 2021)