A multi-tenancy system hosts many clients on shared capacity and resources. Client requests (demand) may differ in resource consumption and traffic patterns, so the system needs to protect itself from overload. This disclosure describes strategies and techniques for applying CPU-based throttling to ensure fair use of resources. To provide comprehensive and flexible protection, the CPU-based throttling is enforced at three levels: a global level, a regional level, and a local level.
At the global level, quota-based CPU throttling is applied to each client. A backend system uses a global in-memory counter service to aggregate and track the CPU usage (service cost) of the client requests it has served. A client-side rate limiter decides whether to serve each client request based on the client's preconfigured quota and its real-time CPU usage. An adaptive probabilistic throttling algorithm maximizes throughput while keeping the CPU usage under the allocated client quota.
At the regional level, rate limiting is based on the regional average CPU load. When a region becomes overloaded, the backend system starts to turn down requests from all clients in that region, based on preconfigured client priorities, until capacity frees up again. When a region is under-utilized, soft throttling is applied to make the best use of the capacity: the backend system makes a best effort to serve all traffic, irrespective of whether clients have exhausted their quotas. When the region is neither overloaded nor under-utilized, the quota-based probabilistic throttling algorithm is used to serve the demand, as at the global level.
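The three regional cases can be sketched as a small decision function. The load thresholds (0.5 and 0.85), the priority cutoff, and the choice to apply the quota check to clients that survive the cutoff during overload are all assumptions made for illustration; the disclosure only states that behavior depends on regional load, client priority, and quota. All names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    OVERLOADED = "overloaded"
    UNDER_UTILIZED = "under_utilized"
    NORMAL = "normal"


def regional_mode(avg_cpu_load: float, low: float = 0.5, high: float = 0.85) -> Mode:
    """Classify a region by its average CPU load (thresholds are assumed)."""
    if avg_cpu_load >= high:
        return Mode.OVERLOADED
    if avg_cpu_load <= low:
        return Mode.UNDER_UTILIZED
    return Mode.NORMAL


def admit(
    mode: Mode,
    client_priority: int,
    quota_accept_prob: float,
    draw: float,
    min_priority_when_overloaded: int = 2,
) -> bool:
    """Decide whether to serve one request.

    draw is a uniform random number in [0, 1) supplied by the caller, and
    quota_accept_prob is the admission probability from the quota check.
    """
    if mode is Mode.UNDER_UTILIZED:
        # Soft throttling: serve even if the client is over quota.
        return True
    if mode is Mode.OVERLOADED and client_priority < min_priority_when_overloaded:
        # Assumed shedding rule: lower-priority clients are turned down first.
        return False
    # Normal load, or a higher-priority client during overload:
    # fall back to quota-based probabilistic throttling.
    return draw < quota_accept_prob
```

Passing the random draw in as an argument keeps the decision deterministic for a given input, which makes the policy easy to test separately from the randomness.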
At the local level, server-side throttling is applied as the last line of defense. Each host rejects requests when its own CPU load rises above a preset threshold. This protects individual hosts from overload when in-region load balancing does not distribute traffic evenly.
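A host-level guard of this kind can be sketched in a few lines. The 0.9 threshold, the `read_cpu_load` callback, and the 503 response are assumptions for illustration; the disclosure specifies only that a host rejects requests once its CPU load exceeds a preset threshold.

```python
from typing import Callable


class HostGuard:
    """Hypothetical server-side check run on every host before serving."""

    def __init__(self, read_cpu_load: Callable[[], float], threshold: float = 0.9) -> None:
        if not 0.0 < threshold <= 1.0:
            raise ValueError("threshold must be in (0, 1]")
        self.read_cpu_load = read_cpu_load  # returns current host load in [0, 1]
        self.threshold = threshold

    def allow(self) -> bool:
        # Reject only when the load is strictly above the preset threshold.
        return self.read_cpu_load() <= self.threshold


def handle(guard: HostGuard, request: str, serve: Callable[[str], str]) -> tuple[int, str]:
    """Serve the request unless the local host is overloaded."""
    if not guard.allow():
        return 503, "host overloaded"
    return 200, serve(request)
```

Because the check reads only the local load, it works without any call to the global counter service and still applies when the upstream throttling layers admit more traffic than a single host can absorb.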
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Anonymous, Anonymous, "CPU based throttling for a multi-tenancy platform", Technical Disclosure Commons, (October 16, 2019)