Abstract
A distributed pod coordination system includes a plurality of pods configured to process work from a message queue, each pod assigned a unique pod identifier, and a state store maintaining coordination state including an active pods set, heartbeat keys, and partition maps associating partitions with respective pods. Each pod registers with the state store by obtaining the unique pod identifier, creating a heartbeat key with a time-to-live expiration value, and adding the pod to the active pods set. Each pod periodically refreshes the heartbeat key and participates in master election by querying the active pods set, verifying heartbeat keys, and determining a master pod based on a lowest pod identifier among pods having valid heartbeat keys. Each pod retrieves assigned partitions and processes corresponding work items. The master pod discovers partitions, distributes partitions using a partition assignment strategy, and stores partition assignments.
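The registration, heartbeat, election, and assignment steps described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the in-memory `InMemoryStateStore` class, its method names, the integer pod identifiers, and the round-robin distribution are hypothetical stand-ins, since the abstract names neither a concrete state store nor a specific partition assignment strategy.

```python
import time
from typing import Callable, Dict, List, Optional, Set


class InMemoryStateStore:
    """Toy stand-in for the state store (hypothetical, not the disclosed API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._next_id = 0
        self.active_pods: Set[int] = set()
        self._heartbeat_expiry: Dict[int, float] = {}
        self.partition_map: Dict[int, List[str]] = {}

    def register(self, ttl: float) -> int:
        """Obtain a unique pod ID, create a heartbeat key, join the active set."""
        pod_id = self._next_id
        self._next_id += 1
        self.refresh_heartbeat(pod_id, ttl)
        self.active_pods.add(pod_id)
        return pod_id

    def refresh_heartbeat(self, pod_id: int, ttl: float) -> None:
        """(Re)create the heartbeat key so it expires ttl seconds from now."""
        self._heartbeat_expiry[pod_id] = self._clock() + ttl

    def has_valid_heartbeat(self, pod_id: int) -> bool:
        """A heartbeat key is valid only while its expiry lies in the future."""
        expiry = self._heartbeat_expiry.get(pod_id)
        return expiry is not None and expiry > self._clock()


def live_pods(store: InMemoryStateStore) -> List[int]:
    """Pods in the active set whose heartbeat key has not expired, ascending."""
    return sorted(p for p in store.active_pods if store.has_valid_heartbeat(p))


def elect_master(store: InMemoryStateStore) -> Optional[int]:
    """The master is the lowest pod ID among pods with a valid heartbeat."""
    pods = live_pods(store)
    return pods[0] if pods else None


def assign_partitions(store: InMemoryStateStore, partitions: List[str]) -> None:
    """Assumed strategy: round-robin over live pods, then store the result."""
    pods = live_pods(store)
    if not pods:
        store.partition_map = {}
        return
    assignment: Dict[int, List[str]] = {p: [] for p in pods}
    for i, partition in enumerate(sorted(partitions)):
        assignment[pods[i % len(pods)]].append(partition)
    store.partition_map = assignment
```

In this sketch every pod can call `elect_master` independently and reach the same answer, because the result depends only on the shared store. A pod that stops refreshing its heartbeat simply ages out of the election once its key expires, and only the pod that finds itself elected would call `assign_partitions`.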
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
KAUR, KOMALJOT; PANDEY, APURBA; and R V, VIGHNESH, "SCALABLE REPLAY SERVICE", Technical Disclosure Commons, (June 25, 2026)
https://www.tdcommons.org/dpubs_series/10566