FSE 2026
Sun 5 - Thu 9 July 2026 Montreal, Canada

The integration of Artificial Intelligence (AI) into IT Operations Management (ITOM), commonly referred to as AIOps, offers substantial potential for automating workflows, enhancing efficiency, and supporting informed decision-making. However, practical implementation of AI within IT operations remains challenging, particularly due to data quality issues, the complexity of cloud-native environments, and skill gaps within operational teams. The emergence of Large Language Models (LLMs) presents new opportunities to address these barriers: their advanced natural language understanding enables the analysis of unstructured data such as logs, incident reports, and technical documentation. In this paper, we present an industry experience report, based on work conducted on Red Hat OpenShift, that explores how LLMs can be operationalized in real-world Kubernetes-based environments. We integrate predictive machine learning models with LLM agents through tool-augmented reasoning, highlighting novel methods to automate IT tasks, enhance observability, and reduce operator burden. Our findings provide insights into both the capabilities and limitations of LLMs in production-grade AIOps scenarios.