Investigating a NASA Cyclomatic Complexity Policy on Maintenance of a Critical System
Background
Monte is a critical system used by NASA for navigation and design of deep space missions (informally referred to as the “Google Maps to the stars”). The system in its current state is a complex integration of hundreds of C++ and Python applications, with several versions used in over 40 missions over the last 18 years. Its continuous, reliable operation is considered critical to the operation of over 18 ongoing missions. This system has been under active, disciplined maintenance for over 20 years. Defect reports and new feature requests arrive continuously, at rates of 160 unique defect reports and 150 feature requests per year. The codebase is currently 800,000 SLOC and growing at a rate of 2750 SLOC per release, with an average of 11.75 releases per year.
Recently a “complexity census” was performed on the maintenance of 2477 Monte system files consisting of a total of 413,275 SLOC. For each file, the following data were collected: SLOC, source lines of comments, cyclomatic complexity (CC), defects associated with the file, defects removed associated with the file, and the person-hours expended to remove them.
This study uses these data to empirically investigate the impacts of CC on defect proneness and maintenance effort. Is higher CC associated with higher defect proneness? With more effort to repair defects? If so, is there a CC value above which action should be taken to reduce defect proneness and repair effort?
Aims
CC is widely cited as a good predictor of maintainability. Perhaps as a result of this, NASA instituted a policy: “If CC is greater than 15, take action to reduce the complexity.”
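To make the policy concrete, the following is a minimal sketch of how a CC check against the threshold of 15 could look. It is not the tool used in the census, which the abstract does not name. It assumes Python source only and approximates McCabe's CC as one plus the number of decision points found in a function's syntax tree. The names `cyclomatic_complexity`, `flag_functions`, and `CC_THRESHOLD` are hypothetical.

```python
import ast

# Policy threshold quoted above; the constant name is an assumption.
CC_THRESHOLD = 15

# Node types that each add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp, ast.comprehension)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate McCabe CC: 1 + number of decision points.

    Nested functions are included in the count of the enclosing function.
    """
    cc = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            cc += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands, i.e. two extra branches.
            cc += len(node.values) - 1
    return cc


def flag_functions(source: str, threshold: int = CC_THRESHOLD) -> dict:
    """Return {function name: CC} for functions whose CC exceeds threshold."""
    flagged = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            cc = cyclomatic_complexity(node)
            if cc > threshold:
                flagged[node.name] = cc
    return flagged


sample = """
def simple(x):
    if x > 0 and x < 10:
        return 1
    return 0
"""
# One 'if' plus one 'and' on top of the base value of 1.
print(cyclomatic_complexity(ast.parse(sample).body[0]))  # 3
print(flag_functions(sample))  # {} since 3 does not exceed 15
```

Production CC tools differ in which constructs they count, so values from this sketch may not match those of any particular analyzer.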
Regardless of what actions are taken, these practices add cost to the maintenance of the system. The concern is that these practices have not been empirically substantiated. Previous studies have questioned whether higher CC increases defect proneness and defect repair effort. How much confidence can we have that reducing CC does, in practice, reduce defect proneness and repair effort? If reducing CC does indeed reduce these, is 15 the magic number above which CC should be reduced?
The aim of this study is to provide empirical evidence to support (or refute) these two best practices. While we are also interested in economics (i.e. does the benefit of the practices outweigh the costs), this is not a focus of this particular study.
Method
We are keenly aware of the strong relationships between CC, SLOD, and SLOC, which greatly complicate parsing out what each factor contributes to associated defects and defect repair effort. Indeed, several studies have argued that CC provides no more useful information than SLOC alone. In the face of this and strong correlations with SLOC, we use Negative Binomial regression to determine the degree to which CC contributes to defects reported and defect removal effort. This analysis is further complicated by non-trivial interactions, non-continuous responses, and high dispersion of the data. Cluster analysis is performed to determine whether there is a CC threshold above which its impacts are “notable”. The analysis takes a mainly empirical approach, making as few assumptions as possible and not relying on theoretical arguments. We make no attempt to validate that CC is the cause of significant impacts on associated defects and defect repair effort.
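The "high dispersion" mentioned above is the usual reason for preferring a Negative Binomial model over a Poisson model for count data. As an illustration only, the sketch below checks for overdispersion and gives a method-of-moments estimate of the dispersion parameter. It does not fit the regression itself, which would normally be done with a statistics package. The defect counts are invented, and the function name `dispersion_summary` is hypothetical.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarize count data: mean, sample variance, variance-to-mean ratio,
    and a method-of-moments estimate of the Negative Binomial alpha."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; dispersion is undefined")
    v = variance(counts)  # sample variance (n - 1 denominator)
    # A Poisson model implies variance == mean, i.e. a ratio near 1.
    ratio = v / m
    # NB2 parameterization: Var = mu + alpha * mu**2,
    # so alpha = (Var - mu) / mu**2, floored at 0 (the Poisson limit).
    alpha = max(0.0, (v - m) / m ** 2)
    return {"mean": m, "variance": v, "ratio": ratio, "alpha": alpha}


# Hypothetical per-file defect counts, invented for illustration only.
defects_per_file = [0, 0, 0, 1, 0, 2, 0, 0, 7, 0, 1, 0, 0, 12, 0, 3]

summary = dispersion_summary(defects_per_file)
print(summary)
if summary["ratio"] > 1:
    print("Variance exceeds the mean: counts are overdispersed.")
```

A variance-to-mean ratio well above 1, as in this made-up sample, is the pattern for which a Negative Binomial model is typically chosen.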
Results
We find that CC, SLOD and SLOC are significant predictors of associated defects. SLOD and SLOC are significant predictors of defect proneness and defect removal effort. As has been cited in other works, the predictiveness of CC is no better than that of SLOC. We find a notable increase in associated defects and defect removal effort when CC is greater than 15.
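The abstract does not say which clustering method was used to look for a threshold. As a stand-in, the sketch below uses a simple one-dimensional two-group split: it tries each observed CC value as a cut point and keeps the one that minimizes the within-group sum of squared deviations in defect counts. The records are made-up numbers chosen only to exercise the code, and the names `within_ss` and `best_cc_split` are hypothetical.

```python
from statistics import mean


def within_ss(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_cc_split(records):
    """Find the CC cut point that best separates defect counts into two groups.

    records: list of (cc, defects) pairs.
    Returns (threshold, mean defects for cc <= threshold,
             mean defects for cc > threshold).
    """
    # The largest CC is excluded: splitting there leaves the upper group empty.
    candidates = sorted({cc for cc, _ in records})[:-1]
    best = None
    for t in candidates:
        low = [d for cc, d in records if cc <= t]
        high = [d for cc, d in records if cc > t]
        score = within_ss(low) + within_ss(high)
        if best is None or score < best[0]:
            best = (score, t, mean(low), mean(high))
    if best is None:
        raise ValueError("need at least two distinct CC values")
    return best[1:]


# Hypothetical (CC, defects) pairs per file, invented for illustration only.
records = [(2, 0), (5, 1), (8, 0), (12, 1), (15, 1), (18, 4), (22, 5), (30, 6)]

threshold, low_mean, high_mean = best_cc_split(records)
print(f"split at CC <= {threshold}: mean defects {low_mean} vs {high_mean}")
```

A split found this way only describes where the data separate most cleanly. Like the study, it makes no claim that CC causes the difference.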
Conclusions
The impacts of CC on defect proneness are, with high confidence, consistent with the expectations of the NASA policy. While the predictiveness of CC for defect proneness and defect removal effort is no better than that of SLOC alone, the information gleaned from them is useful for empirically validating the NASA policy.
Thu 22 Sep (displayed time zone: Athens)
13:30 - 15:00 | Session 2B - Technical Debt & Effort Estimation | ESEM Industry Forum / ESEM Emerging Results and Vision Papers / ESEM Technical Papers, at Sonck | Chair(s): Carolyn Seaman (University of Maryland Baltimore County)
13:30, 20m, Full-paper | Asking about Technical Debt: Characteristics and Automatic Identification of Technical Debt Questions on Stack Overflow | ESEM Technical Papers | Nicholas Kozanidis (Vrije Universiteit Amsterdam), Roberto Verdecchia (Vrije Universiteit Amsterdam), Emitzá Guzmán (Vrije Universiteit Amsterdam)
13:50, 15m, Vision and Emerging Results | An Experience Report on Technical Debt in Pull Requests: Challenges and Lessons Learned | ESEM Emerging Results and Vision Papers | Shubhashis Karmakar (University of Saskatchewan), Zadia Codabux (University of Saskatchewan), Melina Vidoni (Australian National University)
14:05, 20m, Full-paper | Bayesian Analysis of Bug-Fixing Time using Report Data | ESEM Technical Papers | Renan Vieira (Federal University of Ceará), Diego Mesquita (Getulio Vargas Foundation), César Lincoln Mattos (Federal University of Ceará), Ricardo Britto (Ericsson / Blekinge Institute of Technology), Lincoln Rocha (Federal University of Ceará), João Gomes (Federal University of Ceará)
14:25, 15m, Talk | Investigating a NASA Cyclomatic Complexity Policy on Maintenance of a Critical System | ESEM Industry Forum
14:40, 15m, Vision and Emerging Results | An Empirical Study on the Occurrences of Code Smells in Open Source and Industrial Projects | ESEM Emerging Results and Vision Papers | Md. Masudur Rahman (Institute of Information Technology (IIT), University of Dhaka), Abdus Satter (University of Dhaka), Mahbubul Alam Joarder (Institute of Information Technology (IIT), University of Dhaka), Kazi Sakib (Institute of Information Technology, University of Dhaka)