Application performance can degrade for many reasons. Some sources of degradation remain hidden from users by their very nature, such as misconfigured hardware or firmware settings, or a faulty hardware component that is designed to keep the system functional, albeit at reduced capacity. Prevalent monitoring approaches can act on such degradations only after they have persisted in the platform. In this paper, we propose a novel, “firmware-first”, rules-based approach for early detection of performance anomalies, both during the boot process and at runtime (where anomalies may arise from autonomous recovery actions taken in either hardware or firmware). The approach propagates these anomalies to standard user-visible interfaces, helping customers make informed decisions before deploying their services.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.