Defensive Publications Series

A novel encoding & alignment of Histograms of referential integrity columns for scalable data generation

Suresh Soundararajan, Hewlett Packard Enterprise

Abstract

Testing the performance of database management systems is often accomplished using synthetic data and workload generators such as TPC-H and TPC-C. However, most synthetic benchmarks do not fully match customer database configurations. Customer data sets are typically hard to obtain because they are sensitive and prohibitively large. As a result, data management systems are often not thoroughly tested, and performance-related bugs are commonly discovered after deployment, when the cost of fixing them is very high. We propose XGen, a scalable data generator that produces data sets from customer metadata, including integrity constraints and histogram statistics. Handling multiple referential integrity constraints is a hard problem. We address it by indirectly encoding the column dependencies, so that the data for each column can still be generated independently, which keeps data generation scalable.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Recommended Citation

Soundararajan, Suresh, "A novel encoding & alignment of Histograms of referential integrity columns for scalable data generation", Technical Disclosure Commons, (January 10, 2019)
https://www.tdcommons.org/dpubs_series/1868
