Abstract
A self-contained columnar record format stores a user engagement sequence across multiple traits in multiple tiers, each with its own time window and granularity. The record includes a header that specifies tier descriptors (including entry counts, byte offsets, and encoder identifiers) and trait descriptors (including data types and trait-specific aggregation functions). A rolling compaction pipeline moves data from finer tiers to coarser tiers by bucketizing entries at the coarser granularity and applying the stored aggregation function for each trait (e.g., SUM, MAX, MIN, MODE, OR, SAMPLE_TOP_K, WEIGHTED_AVG). A resolution-aware decode API receives a resolution budget that selects a subset of tiers, with optional per-tier entry limits, and decodes only the selected tiers using the encoders indicated in the header. Different consumers can therefore obtain different temporal resolutions, with decode cost that scales with the tiers requested rather than with the full record.
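The compaction and decode steps described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names compact, decode, AGGREGATORS, and DECODERS, the in-memory dictionary layout, the "json" and "zlib_json" encoder identifiers, and the choice to keep the most recent entries under an entry limit are all assumptions made for illustration. Only a subset of the aggregation functions named in the abstract is shown.

```python
import json
import zlib
from collections import Counter, defaultdict

# Hypothetical registry keyed by aggregation names from the abstract.
AGGREGATORS = {
    "SUM": sum,
    "MAX": max,
    "MIN": min,
    "OR": any,
    "MODE": lambda values: Counter(values).most_common(1)[0][0],
}

# Hypothetical encoder identifiers; a real record would define its own.
DECODERS = {
    "json": json.loads,
    "zlib_json": lambda payload: json.loads(zlib.decompress(payload)),
}


def compact(entries, trait_aggs, coarse_granularity):
    """Bucketize finer-tier entries and aggregate each trait per bucket.

    entries: list of (timestamp_seconds, {trait: value}) from the finer tier.
    trait_aggs: {trait: aggregation name}, as a header might store it.
    coarse_granularity: bucket width of the coarser tier, in seconds.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for ts, traits in entries:
        bucket_start = ts - ts % coarse_granularity
        for trait, value in traits.items():
            buckets[bucket_start][trait].append(value)
    return [
        (start, {t: AGGREGATORS[trait_aggs[t]](vals) for t, vals in cols.items()})
        for start, cols in sorted(buckets.items())
    ]


def decode(record, budget):
    """Decode only the tiers named in the resolution budget.

    record: {tier name: (encoder id, encoded payload bytes)}.
    budget: {tier name: entry limit, or None for all entries}.
    """
    result = {}
    for tier, limit in budget.items():
        encoder_id, payload = record[tier]
        rows = DECODERS[encoder_id](payload)  # tiers outside the budget are never touched
        if limit is not None:
            # Assumption: an entry limit keeps the most recent entries.
            rows = rows[max(0, len(rows) - limit):]
        result[tier] = rows
    return result
```

In this sketch, a consumer that needs only coarse history would pass a budget naming the coarse tier alone, so the finer tiers' payloads are never decompressed or parsed. The sketch still decodes a selected tier in full before applying the entry limit; a format with per-entry offsets could avoid that.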
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Anonymous, "Hierarchical Multi-Resolution Columnar Encoding with Trait-Specific Aggregation for User Engagement Sequence Feature Stores", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/10767