Techniques are described herein that allow storage systems to guarantee low latency for small requests while maintaining optimal overall system throughput. The techniques batch requests, classify them, allocate resources fairly among them, and provide a mechanism to expedite the processing of small requests. In today’s cloud environments, it is critical that diverse applications run by multiple users can share access to generic storage systems without affecting each other’s performance.
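
The following is a minimal Python sketch of how batching, classification, fair allocation, and expediting of small requests could fit together. It is an illustration under stated assumptions, not the disclosed implementation: the `SMALL_LIMIT` threshold, the byte `budget` per batch, the round-robin fairness policy, and all names (`Request`, `classify`, `round_robin`, `build_batch`) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass

# Assumed threshold (bytes) separating "small" from "large" requests.
SMALL_LIMIT = 64 * 1024


@dataclass
class Request:
    user: str
    size: int  # payload size in bytes


def classify(req):
    """Label a request by size; the cutoff is an assumption of this sketch."""
    return "small" if req.size <= SMALL_LIMIT else "large"


def round_robin(requests):
    """Interleave requests across users, one per user per pass,
    preserving each user's own submission order."""
    queues = {}
    for r in requests:
        queues.setdefault(r.user, deque()).append(r)
    order = []
    while queues:
        for user in list(queues):
            order.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]
    return order


def build_batch(pending, budget):
    """Select requests for the next batch within a byte budget.

    Small requests are considered first (the expedite step); whatever
    budget remains is offered to large requests. Both classes are
    ordered round-robin across users as a simple fairness policy.
    """
    small = round_robin([r for r in pending if classify(r) == "small"])
    large = round_robin([r for r in pending if classify(r) == "large"])
    batch, used = [], 0
    for r in small + large:
        if used + r.size <= budget:
            batch.append(r)
            used += r.size
    return batch
```

In this sketch, small requests never wait behind large ones within a batch, while the byte budget caps the work admitted per batch. A real system would also need some guard against starving large requests, which this sketch does not attempt.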

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.