Chaos Engineering (CE) has emerged as a proactive methodology to improve the resilience of modern distributed systems, particularly within DevOps environments. Originally pioneered by Netflix, CE simulates real-world failures to expose weaknesses before they impact production. In this paper, we present a systematic gray literature review that investigates how industry practitioners have adopted and adapted CE principles over recent years. Analyzing 50 sources published between 2019 and early 2024, we develop a comprehensive classification framework that extends the foundational CE principles into ten distinct concepts. Our study reveals that while the core tenets of CE remain influential, practitioners increasingly emphasize controlled experimentation, automation, and risk mitigation strategies to align with the demands of agile and continuously evolving DevOps pipelines. Our results not only enhance the understanding of how CE is conceived and implemented in practice but also offer guidance for future research and industrial applications aimed at improving system robustness in dynamic production environments.
Tasmia Zerin (Institute of Information Technology (IIT), University of Dhaka); Moumita Asad (University of California, Irvine); B M Mainul Hossain (University of Dhaka); Kazi Sakib (Institute of Information Technology, University of Dhaka)