Abstract

Systems that manage computational workloads on shared hardware, such as graphics processing units or other accelerators, often rely on a small number of coarse-grained priority levels. Such levels may not differentiate sufficiently among a wide spectrum of competing tasks, which can lead to suboptimal resource allocation. The described system generates a granular, numerical priority score for each workload using a computational component, for instance, a multi-dimensional calculator. This component processes multiple input parameters associated with a request, such as the user's service tier, the request's latency sensitivity, and its feature modality, and combines them into a single synthesized priority index. The granular score lets scheduling and resource management systems draw finer distinctions between tasks, which may improve hardware utilization by opportunistically filling idle compute cycles with lower-priority work. The system may also support dynamic priority adjustments, which can contribute to service stability and fair resource access.
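As a rough illustration of how a multi-dimensional calculator might combine these inputs, the following Python sketch computes a weighted sum over the three dimensions named above and maps it to an integer index. The abstract does not specify a formula, so the weighted-sum form, the lookup tables, the weights, the 0 to 1000 range, and the names `Request`, `priority_score`, and `boost` are all hypothetical choices made for this example. The `boost` parameter stands in for a dynamic priority adjustment.

```python
from dataclasses import dataclass

# Hypothetical lookup tables: the abstract names these dimensions but
# gives no values, so every number here is an illustrative assumption.
TIER_SCORE = {"free": 0.2, "standard": 0.5, "premium": 1.0}
MODALITY_SCORE = {"batch": 0.1, "text": 0.6, "voice": 1.0}

# Assumed relative weight of each dimension; the weights sum to 1.0.
WEIGHTS = {"tier": 0.5, "latency": 0.3, "modality": 0.2}

MAX_SCORE = 1000  # assumed upper bound of the priority index


@dataclass(frozen=True)
class Request:
    service_tier: str
    latency_sensitivity: float  # 0.0 (tolerant) to 1.0 (strict)
    modality: str


def _clamp(value: float) -> float:
    """Restrict a value to the closed interval [0.0, 1.0]."""
    return min(max(value, 0.0), 1.0)


def priority_score(req: Request, boost: float = 0.0) -> int:
    """Combine the request's dimensions into one integer priority index.

    `boost` models a dynamic adjustment (positive or negative) applied
    on top of the static score before scaling.
    """
    base = (
        WEIGHTS["tier"] * TIER_SCORE[req.service_tier]
        + WEIGHTS["latency"] * _clamp(req.latency_sensitivity)
        + WEIGHTS["modality"] * MODALITY_SCORE[req.modality]
    )
    return round(_clamp(base + boost) * MAX_SCORE)


# Usage: order a queue so the highest-scoring request is served first.
queue = [
    Request("free", 0.0, "batch"),
    Request("premium", 1.0, "voice"),
    Request("standard", 0.4, "text"),
]
ordered = sorted(queue, key=priority_score, reverse=True)
```

Because the score is a number rather than one of a few fixed levels, a scheduler could compare any two requests directly, and low-scoring work such as the batch request above could be held back to fill otherwise idle cycles.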

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
