Abstract
Hardware and software systems such as data centers and compute clusters are highly configurable, often exposing dozens of true/false settings, constants, and thresholds. These parameter settings can significantly affect performance, but the parameter space is too vast to explore manually, and the behavior of a setting in a test environment may not predict its behavior in production. This disclosure describes techniques that use machine learning to explore the parameter space of a hardware/software system and to choose a set of parameters that yields a performant system. The components used to implement the techniques include a control system for describing and adjusting parameters within their permissible ranges; a service that receives requests from the control system and actuates new behaviors based on the parameters; a performance log that is analyzed by machine learning algorithms; and a feedback mechanism that enables the control system to dynamically adjust parameters based on the optimization performed by a machine learning module.
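The four components named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the parameter names (`enable_prefetch`, `batch_size`, `cache_threshold`), the `simulated_service` latency model, and the `propose` function, which uses a simple explore-and-mutate search as a stand-in for the machine learning module.

```python
import random
from dataclasses import dataclass, field

# Hypothetical parameter space: each name maps to its permissible values.
PARAM_SPACE = {
    "enable_prefetch": [False, True],
    "batch_size": [8, 16, 32, 64],
    "cache_threshold": [0.25, 0.5, 0.75],
}


def simulated_service(params, rng):
    """Stand-in for the real service: returns a noisy latency in ms (lower is better)."""
    latency = 100.0
    if params["enable_prefetch"]:
        latency -= 20.0
    latency += abs(params["batch_size"] - 32) * 0.5
    latency += abs(params["cache_threshold"] - 0.5) * 40.0
    return latency + rng.uniform(-1.0, 1.0)


@dataclass
class ControlSystem:
    """Describes parameters, enforces permissible ranges, and records results."""
    space: dict
    log: list = field(default_factory=list)  # performance log of (params, latency)

    def validate(self, params):
        return all(params.get(k) in allowed for k, allowed in self.space.items())

    def actuate(self, params, rng):
        # Reject settings outside the permissible range before touching the service.
        if not self.validate(params):
            raise ValueError("parameter outside permissible range")
        latency = simulated_service(params, rng)
        self.log.append((dict(params), latency))
        return latency


def propose(space, log, rng, explore=0.3):
    """Toy optimizer (an assumed stand-in for the ML module).

    With probability `explore`, sample a random configuration; otherwise
    change one parameter of the best configuration logged so far.
    """
    if not log or rng.random() < explore:
        return {k: rng.choice(v) for k, v in space.items()}
    best_params, _ = min(log, key=lambda entry: entry[1])
    candidate = dict(best_params)
    key = rng.choice(sorted(space))
    candidate[key] = rng.choice(space[key])
    return candidate


def optimize(rounds=200, seed=0):
    """Feedback loop: propose, actuate, log, repeat; return the best logged entry."""
    rng = random.Random(seed)
    control = ControlSystem(PARAM_SPACE)
    for _ in range(rounds):
        control.actuate(propose(control.space, control.log, rng), rng)
    return min(control.log, key=lambda entry: entry[1])


best_params, best_latency = optimize()
print(best_params, round(best_latency, 1))
```

In a real deployment the simulated service would be replaced by the production system and the search by a learned model; the sketch only shows how the log feeds each new proposal back through the control system.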
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
NA, "Smart Flags for Data Center Fleet Optimization", Technical Disclosure Commons, (June 25, 2025)
https://www.tdcommons.org/dpubs_series/8276