Abstract

Users who work with time series data typically break time series problems into isolated tasks and use separate libraries, packages, tools, and services for each task. As a result, the tooling is often fragmented: analysts have to load different packages for common tasks such as data preprocessing, clustering, feature extraction, forecasting, hierarchical reconciliation, evaluation, and visualization. This disclosure describes a reliable, scalable infrastructure that meets the various needs of time series practitioners without adding engineering overhead. The infrastructure is modular, and its modules are connected using a flow-type declarative language, which makes the infrastructure extensible and future-proof. Practitioners can use the entire infrastructure or only certain modules, while performing other operations using first- or third-party libraries or pipelines.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
