An Empirical Study of Predicting Fault-prone Components and their Evolution
Predicting fault-prone components at an early stage helps an organization ensure quality software delivery. Predicting the fault-proneness of components makes test prioritization easier, since developers can allocate more time and resources to the “High” fault-prone components. Furthermore, test prioritization helps reduce the cost of regression and allows developers to make careful decisions regarding sensitive components. This paper presents an empirical study of predicting fault-prone components and their evolution in two popular open-source projects, the Chromium web browser and ProFTPD. First, we construct and compare two prediction models, Random Forest (RF) and Support Vector Machine (SVM), for classifying components as “High” or “Low” fault-prone. Second, we analyze the evolution of the fault-proneness of 22 components of Chromium across 42 releases and 12 components of ProFTPD across 15 releases. Chromium has a median of 3.9k commits per release with a standard deviation of 1k, while ProFTPD has 578 commits per release with a standard deviation of 654. We use total churn, bug-fix commits, bug-fix churn, rush-period changes, number of developers, and number of files modified as the measures to construct our prediction models. Our models successfully predict the fault-proneness of the components. Random Forest outperforms SVM, with accuracies of 96% and 95% for Chromium and ProFTPD, respectively. We found that the majority of the Chromium components are “High” fault-prone. For ProFTPD, components are found to be “High” and “Low” fault-prone interchangeably. The fault-proneness of the components has evolved over the releases, and “Chromecast” in Chromium seems to be gradually turning into a “Low” fault-prone component.
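As a rough illustration of the measures listed in the abstract, the sketch below aggregates five of them per release and component from a list of commit records. It is not the authors' implementation: the `Commit` record, its field names, and the `component_measures` helper are hypothetical, bug-fix commits are assumed to be labelled already, and rush-period changes are left out because the abstract does not define them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record; the field names are illustrative only."""
    release: str
    component: str
    author: str
    files: tuple      # paths touched by the commit
    churn: int        # lines added plus lines deleted
    is_bug_fix: bool  # assumed to be labelled upstream


def component_measures(commits):
    """Aggregate per-(release, component) measures from commit records."""
    acc = defaultdict(lambda: {
        "total_churn": 0,
        "bug_fix_commits": 0,
        "bug_fix_churn": 0,
        "developers": set(),
        "files": set(),
    })
    for c in commits:
        row = acc[(c.release, c.component)]
        row["total_churn"] += c.churn
        if c.is_bug_fix:
            row["bug_fix_commits"] += 1
            row["bug_fix_churn"] += c.churn
        row["developers"].add(c.author)
        row["files"].update(c.files)
    # Replace the sets with counts so each row is a plain numeric record.
    return {
        key: {
            "total_churn": row["total_churn"],
            "bug_fix_commits": row["bug_fix_commits"],
            "bug_fix_churn": row["bug_fix_churn"],
            "num_developers": len(row["developers"]),
            "num_files_modified": len(row["files"]),
        }
        for key, row in acc.items()
    }
```

Each resulting row could then serve as one feature vector for a classifier such as Random Forest or SVM, with the “High” or “Low” label supplied separately.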
Thu 8 Dec (displayed time zone: Osaka, Sapporo, Tokyo)
15:00 - 16:30 | Empirical Studies 2 | Technical Track at Room2 | Chair(s): Yusuf Sulistyo Nugroho (Universitas Muhammadiyah Surakarta)
15:00 | 20m | Paper | Exploring Activity and Contributors on GitHub: Who, What, When, and Where | Technical Track | Xiaoya Xia (East China Normal University), Zhenjie Weng (East China Normal University), will wang, Shengyu Zhao (Tongji University)
15:20 | 20m | Paper | The Language of Programming: On the Vocabulary of Names | Technical Track
15:40 | 20m | Paper | An Empirical Study of Predicting Fault-prone Components and their Evolution | Technical Track
16:00 | 20m | Paper | Empirical Study of Co-Renamed Identifiers | Technical Track | Yuki Osumi (Tokyo Institute of Technology), Naotaka Umekawa (Tokyo Institute of Technology), Hitomi Komata (Tokyo Institute of Technology), Shinpei Hayashi (Tokyo Institute of Technology)