Abstract
This paper describes technology for scheduling large-scale computational tasks, such as large language model (LLM) workloads, on specialized hardware accelerators. A decision engine dynamically evaluates external energy market data, utility grid status, and renewable generation forecasts together with internal task characteristics, including service level objectives (SLOs). Using a multi-factor cost function, the engine balances adherence to task deadlines against minimizing operational electricity costs and reducing strain on the power grid. High-priority, low-latency requests are processed immediately, while delay-tolerant batch requests are scheduled for periods of lower energy cost or higher renewable energy availability, with the aim of lowering operational expenditure and supporting grid stability.
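As a rough illustration of the multi-factor cost function idea, the following Python sketch picks a start slot for a task from a per-slot forecast. It is not the disclosure's actual method: the `Slot` and `Task` fields, the weights in `slot_cost`, and the `choose_start` helper are all hypothetical names and values chosen for illustration, and a simple weighted sum is assumed as the cost function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    """Forecast for one scheduling interval (all fields are assumptions)."""
    price: float            # electricity price, arbitrary currency units
    grid_stress: float      # 0.0 (relaxed grid) to 1.0 (strained grid)
    renewable_share: float  # 0.0 to 1.0 forecast share of renewable generation


@dataclass
class Task:
    duration: int              # consecutive slots the task needs
    deadline: int              # slot index by which the task must have finished
    low_latency: bool = False  # True for high-priority, latency-sensitive requests


def slot_cost(slot: Slot, w_price: float = 1.0,
              w_grid: float = 50.0, w_renew: float = 30.0) -> float:
    """Hypothetical weighted sum: price and grid stress add cost,
    renewable availability reduces it."""
    return (w_price * slot.price
            + w_grid * slot.grid_stress
            - w_renew * slot.renewable_share)


def choose_start(task: Task, forecast: List[Slot]) -> Optional[int]:
    """Return the start slot for the task, or None if the deadline
    cannot be met within the forecast horizon."""
    if task.low_latency:
        return 0  # latency-sensitive work runs immediately
    latest_start = min(task.deadline, len(forecast)) - task.duration
    if latest_start < 0:
        return None
    best_start, best_cost = None, float("inf")
    for start in range(latest_start + 1):
        window = forecast[start:start + task.duration]
        cost = sum(slot_cost(s) for s in window)
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start


forecast = [
    Slot(price=100, grid_stress=0.8, renewable_share=0.1),
    Slot(price=90, grid_stress=0.6, renewable_share=0.2),
    Slot(price=40, grid_stress=0.2, renewable_share=0.7),
    Slot(price=45, grid_stress=0.2, renewable_share=0.6),
]
batch_start = choose_start(Task(duration=2, deadline=4), forecast)
urgent_start = choose_start(Task(duration=1, deadline=1, low_latency=True), forecast)
```

In this toy forecast, the delay-tolerant batch task is deferred to the cheaper, more renewable-heavy slots at the end of the horizon, while the latency-sensitive task starts in the first slot regardless of cost. The deadline acts as a hard constraint here; a real system might instead fold SLO risk into the cost function as another weighted term.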
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
N/A, "Energy-Cost and Grid-Aware Scheduling of Computational Tasks on Hardware Accelerators", Technical Disclosure Commons, (October 28, 2025)
https://www.tdcommons.org/dpubs_series/8804