Inventor(s)

NA

Abstract

Large and complex software systems are critical in many contexts. Outages and security incidents in such systems can cause financial losses and user dissatisfaction. Root cause analysis for system outages is typically performed manually, which is costly and time-consuming. It relies on time series data and logs from the monitored systems, records of changes to those systems, outages of critical infrastructure that the systems depend on, and similar sources. This disclosure describes techniques that use the ability of a large language model (LLM) to ingest large amounts of data and perform reasoning tasks in response to prompts. Per the techniques, relevant data about a monitored system is provided to an LLM along with a prompt that instructs the LLM to perform root cause analysis. Engineering teams use the LLM output to determine and execute mitigation strategies. The prompt is updated, and the LLM is further trained, based on how well the LLM performs the root cause analysis.
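
As an illustration only, the step of combining monitoring data with a root cause analysis prompt might be sketched as follows. The abstract does not specify a data model, prompt wording, or LLM interface, so every name here (IncidentContext, build_rca_prompt, the section titles, and the instruction text) is a hypothetical assumption. The sketch only assembles the prompt text and does not call any model.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentContext:
    """Hypothetical container for the data sources named in the abstract."""
    system_name: str
    metrics: list[str] = field(default_factory=list)             # time series summaries
    logs: list[str] = field(default_factory=list)                # log excerpts
    changes: list[str] = field(default_factory=list)             # change records
    dependency_outages: list[str] = field(default_factory=list)  # infrastructure outages


def _section(title: str, items: list[str]) -> str:
    """Render one data source as a titled bullet list."""
    if items:
        body = "\n".join(f"- {item}" for item in items)
    else:
        body = "- (none provided)"
    return f"{title}:\n{body}"


def build_rca_prompt(ctx: IncidentContext) -> str:
    """Assemble a single prompt string (assumed wording, not from the disclosure)."""
    parts = [
        f"You are assisting with root cause analysis for the system '{ctx.system_name}'.",
        _section("Time series data", ctx.metrics),
        _section("Logs", ctx.logs),
        _section("Recent changes", ctx.changes),
        _section("Outages of dependencies", ctx.dependency_outages),
        "Identify the most likely root cause, cite the supporting evidence "
        "from the data above, and suggest mitigation steps.",
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    example = IncidentContext(
        system_name="checkout-service",  # made-up example values
        logs=["12:01 ERROR database connection timeout"],
        changes=["11:55 configuration change to connection pool size"],
    )
    print(build_rca_prompt(example))
```

Keeping each data source in its own labeled section is one possible design choice: it makes it straightforward to revise the instruction text later, in line with the prompt updates the abstract mentions, without touching how the data is rendered.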

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.