Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime
Infrastructure as Code is the practice of automating the provisioning, configuration, and orchestration of network nodes using code in which variable values such as configuration parameters, node hostnames, etc. play a central role. Mistakes in these values are an important cause of infrastructure defects and corresponding outages. Ansible, a popular IaC language, nonetheless features semantics which can cause confusion about the value of variables.
In this paper, we identify six novel code smells related to Ansible’s intricate variable precedence rules and lazy-evaluated template expressions. Their detection requires an accurate representation of control and data flow, for which we transpose the program dependence graph to Ansible. We use the resulting detector to empirically investigate the prevalence of these variable smells in 21,931 open-source Ansible roles, uncovering 31,334 unique smell instances across 4,260 roles. We observe an upward trend in the number of variable smells over time, that it may take a long time before they are fixed, and that code changes more often introduce new smells than fix existing ones. Our results are a call to arms for more in-depth quality checkers for IaC code, and highlight the importance of transcending syntax in IaC research.