Defensive Publications Series

STORAGE-AWARE SHUFFLE PLANNER (SASP): A STORAGE-CONSTRAINED SHUFFLE OPTIMISATION LAYER FOR DISTRIBUTED COMPUTATION ENGINES

Abstract

The present disclosure relates to distributed data processing, and in particular to a Storage-Aware Shuffle Planner (SASP) that optimises shuffle execution in distributed computation engines. SASP is a planning layer that restructures shuffle execution according to the observed capacity of the underlying storage. The system collects object-store telemetry (bandwidth, IOPS, latency, throttling, and concurrency) and uses it to construct a storage pressure vector for each prefix. In parallel, it derives expected shuffle partition sizes from execution plans. From these two inputs, SASP forecasts partition-level strain and generates a Shuffle Prefix Mapping Table (SPMT) that assigns partitions to output prefixes so that I/O load is balanced across them. A plan rewriter then modifies execution plans to incorporate this storage-aware partitioning. During execution, a custom ShuffleWriter writes data to the assigned prefixes, and a ShuffleReader retrieves it using the same mapping. The system also adapts dynamically, redirecting partitions across prefixes in response to throttling, which is intended to improve performance and reliability.
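
The abstract does not specify how the SPMT is computed, so the following is only a minimal sketch of one plausible approach, not the disclosed method. It assumes a reduced pressure vector (bandwidth and throttle rate only), a made-up "headroom" score, and a greedy largest-first placement. The names `PrefixPressure`, `headroom`, `build_spmt`, and `redirect_on_throttle` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixPressure:
    """Hypothetical, reduced storage pressure vector for one prefix."""
    prefix: str
    bandwidth_mbps: float  # observed usable write bandwidth
    throttle_rate: float   # fraction of recent requests throttled, 0.0 to 1.0

    def headroom(self) -> float:
        # Assumed scoring rule: bandwidth discounted by the throttle rate.
        return self.bandwidth_mbps * (1.0 - self.throttle_rate)


def _place(sizes, usable, load, spmt):
    """Greedy placement: largest partition first, each to the prefix with
    the lowest projected load-to-headroom ratio after the assignment."""
    for pid, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        best = min(
            usable,
            key=lambda p: ((load[p.prefix] + size) / p.headroom(), p.prefix),
        )
        spmt[pid] = best.prefix
        load[best.prefix] += size


def build_spmt(partition_mb, pressures):
    """Return a partition-id -> prefix mapping (a toy SPMT)."""
    usable = [p for p in pressures if p.headroom() > 0]
    if not usable:
        raise ValueError("no prefix has usable headroom")
    spmt = {}
    _place(partition_mb, usable, {p.prefix: 0.0 for p in usable}, spmt)
    return spmt


def redirect_on_throttle(spmt, throttled_prefix, partition_mb, pressures):
    """Return a new SPMT with partitions moved off a throttled prefix.

    If no other prefix has headroom, the mapping is returned unchanged.
    """
    usable = [
        p for p in pressures
        if p.prefix != throttled_prefix and p.headroom() > 0
    ]
    if not usable:
        return dict(spmt)
    kept = {pid: pre for pid, pre in spmt.items() if pre != throttled_prefix}
    # Existing load on the surviving prefixes counts toward the balance.
    load = {p.prefix: 0.0 for p in usable}
    for pid, pre in kept.items():
        if pre in load:
            load[pre] += partition_mb[pid]
    displaced = {
        pid: partition_mb[pid]
        for pid, pre in spmt.items() if pre == throttled_prefix
    }
    _place(displaced, usable, load, kept)
    return kept


if __name__ == "__main__":
    pressures = [
        PrefixPressure("shuffle/a", bandwidth_mbps=400.0, throttle_rate=0.0),
        PrefixPressure("shuffle/b", bandwidth_mbps=200.0, throttle_rate=0.5),
    ]
    sizes = {0: 300.0, 1: 200.0, 2: 100.0, 3: 100.0}
    table = build_spmt(sizes, pressures)
    print(table)
    print(redirect_on_throttle(table, "shuffle/b", sizes, pressures))
```

In this sketch a writer would look up its partition id in the table to choose an output prefix, and a reader would use the same table to locate the data. A real planner would presumably weigh the other telemetry dimensions named in the abstract (IOPS, latency, concurrency) as well.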

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Verma, Ananya and Khan, Raouf, "STORAGE-AWARE SHUFFLE PLANNER (SASP): A STORAGE-CONSTRAINED SHUFFLE OPTIMISATION LAYER FOR DISTRIBUTED COMPUTATION ENGINES", Technical Disclosure Commons, (April 16, 2026)
https://www.tdcommons.org/dpubs_series/9837
