This disclosure describes techniques for server overload protection in microservices-based computing systems. Per techniques of this disclosure, a server throttles new incoming traffic when it determines that backend processes downstream of the server are overloaded. The server monitors metrics associated with overloading of the backend processes. If the metrics indicate that one or more backend processes are overloaded, the server throttles traffic by rejecting part of the new incoming requests, thereby rate limiting them. To decide which requests to accept, the server determines a cost function threshold, e.g., a running window average of a cost function such as queries per second received and successfully processed by the server. While backend processes are overloaded, the server accepts new requests only when the incoming cost function meets the cost function threshold. The techniques can mitigate performance degradation of server stacks while enabling optimal computing resource utilization.
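The admission logic described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the class name `OverloadThrottle`, its methods, the 10-second window, and the one-second admission interval are all hypothetical choices made for illustration. The sketch also assumes that "meets the cost function threshold" means the rate of requests admitted over the last second stays below the running window average of successfully processed queries per second, and that the overload signal is derived elsewhere from backend metrics.

```python
from collections import deque


class OverloadThrottle:
    """Hypothetical admission controller; names and defaults are illustrative."""

    def __init__(self, window_seconds=10.0):
        self.window_seconds = window_seconds
        self.successes = deque()         # completion times of successful requests
        self.admitted = deque()          # admission times within the last second
        self.backend_overloaded = False  # assumed to be set from backend metrics

    @staticmethod
    def _trim(timestamps, now, span):
        # Drop entries that fell out of the window ending at `now`.
        # Assumes timestamps are appended in non-decreasing order.
        while timestamps and timestamps[0] <= now - span:
            timestamps.popleft()

    def record_success(self, now):
        """Record that a request was received and successfully processed."""
        self.successes.append(now)
        self._trim(self.successes, now, self.window_seconds)

    def threshold_qps(self, now):
        """Running window average of successfully processed queries per second."""
        self._trim(self.successes, now, self.window_seconds)
        return len(self.successes) / self.window_seconds

    def admit(self, now):
        """Return True to accept a new request, False to reject it."""
        self._trim(self.admitted, now, 1.0)
        if not self.backend_overloaded:
            # No overload: accept everything, but keep tracking the incoming rate.
            self.admitted.append(now)
            return True
        # Overload: accept only while the admitted rate is below the threshold.
        if len(self.admitted) < self.threshold_qps(now):
            self.admitted.append(now)
            return True
        return False
```

In a real server, callers would pass a monotonic clock reading (for example `time.monotonic()`) as `now`, and a rejected request would typically be answered with an overload error rather than forwarded to the backends. Because this sketch recomputes the threshold continuously, a sustained drop in successful requests lowers the admission rate; freezing the threshold at the moment overload is detected is an equally plausible reading of the text.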

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.