Techniques are described herein for using stateful availability to maintain a session when a stateful service instance cluster is auto-scaled or when a service instance fails. These techniques offer advantages over current approaches, in which public cloud and Kubernetes deployments can provide only stateless availability because of limitations in how a cluster of service instances is load balanced.

