Real World Projects, Real Faults: Evaluating Spectrum Based Fault Localization Techniques on Python Projects
Spectrum Based Fault Localization (SBFL) is a statistical approach to identifying faulty code within a program given a program spectrum (i.e., a record of the program elements executed by passing and failing test cases). Several SBFL techniques have been proposed over the years, but most evaluations of those techniques were done only on Java and C programs, and frequently involved artificial faults. Considering the current popularity of Python, indicated by the results of the 2020 Stack Overflow developer survey, it becomes increasingly important to understand how SBFL techniques perform on Python projects. However, this remains an understudied topic. In this work, our objective is to analyze the effectiveness of popular SBFL techniques on real-world Python projects. We also aim to compare our observed performance on Python to previously reported performance on Java. Using the recently built bug benchmark BugsInPy as our fault dataset, we apply five popular SBFL techniques (Tarantula, Ochiai, OP, Barinel, and DStar) and analyze their performance. We subsequently compare our results with results from Java and C projects reported in earlier related works.
We find that 1) the real faults in BugsInPy are harder to identify using SBFL techniques than the real faults in Defects4J, as indicated by the lower performance of the evaluated SBFL techniques on BugsInPy; 2) older techniques such as Tarantula, Barinel, and Ochiai consistently outperform newer techniques (i.e., OP and DStar) across a variety of metrics and debugging scenarios; 3) claims in preceding studies done on artificial faults in C and Java (such as "OP outperforms Tarantula") do not hold on real Python faults; 4) lower-performing techniques can outperform higher-performing techniques in some cases, emphasizing the potential benefit of combining SBFL techniques. Our results yield insight into how popular SBFL techniques perform on real Python faults and emphasize the importance of conducting SBFL evaluations on real faults.
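To make the five techniques named in the abstract concrete, the following is a minimal sketch of how each one could score a program element from a spectrum. It uses the formulas as they are commonly stated in the SBFL literature; the exact variants, tie-breaking, and granularity used in the paper may differ. The function names (`suspiciousness`, `rank_elements`), the toy spectrum, and the DStar exponent of 2 are assumptions made for illustration only.

```python
import math

def suspiciousness(ef, ep, total_f, total_p, star=2):
    """Score one program element with five commonly cited SBFL formulas.

    ef / ep: failing / passing tests that execute the element.
    total_f / total_p: total failing / passing tests in the suite.
    star: DStar exponent (2 is an assumption, not taken from the paper).
    """
    nf = total_f - ef                      # failing tests that skip the element
    executed = ef + ep
    fail_ratio = ef / total_f if total_f else 0.0
    pass_ratio = ep / total_p if total_p else 0.0
    ratio_sum = fail_ratio + pass_ratio
    dstar_den = ep + nf
    if dstar_den:
        dstar = ef ** star / dstar_den
    else:
        # Zero denominator: treat as maximally suspicious if any failure hits it.
        dstar = math.inf if ef else 0.0
    return {
        "tarantula": fail_ratio / ratio_sum if ratio_sum else 0.0,
        "ochiai": ef / math.sqrt(total_f * executed) if total_f and executed else 0.0,
        "op2": ef - ep / (total_p + 1),
        "barinel": 1 - ep / executed if executed else 0.0,
        "dstar": dstar,
    }

def rank_elements(coverage, failed, technique):
    """Rank elements from most to least suspicious.

    coverage: test name -> set of executed elements (hypothetical layout).
    failed: set of failing test names.
    """
    total_f = len(failed)
    total_p = len(coverage) - total_f
    elements = set().union(*coverage.values())
    result = {}
    for el in elements:
        ef = sum(1 for t, cov in coverage.items() if el in cov and t in failed)
        ep = sum(1 for t, cov in coverage.items() if el in cov and t not in failed)
        result[el] = suspiciousness(ef, ep, total_f, total_p)[technique]
    # Higher score first; break ties by element for a deterministic order.
    return sorted(result, key=lambda el: (-result[el], el))

# Toy spectrum: three tests over four line numbers, one failing test.
coverage = {"t1": {1, 2}, "t2": {1, 3}, "t3": {1, 3, 4}}
failed = {"t3"}
for name in ("tarantula", "ochiai", "op2", "barinel", "dstar"):
    print(name, rank_elements(coverage, failed, name))
```

In this toy spectrum all five formulas place line 4, which only the failing test executes, at the top of the ranking. On real spectra the rankings can diverge, which is what makes a per-technique comparison such as the one reported above informative.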
Wed 17 May (displayed time zone: Hobart)
11:00 - 12:30 | Fault localization | Journal-First Papers / Technical Track / Showcase | Meeting Room 103 | Chair(s): Rui Abreu, University of Porto
11:00 | 15m Talk | Evaluating the Impact of Experimental Assumptions in Automated Fault Localization | Technical Track | Ezekiel Soremekun Royal Holloway, University of London, Lukas Kirschner Saarland University, Marcel Böhme MPI-SP, Germany and Monash University, Australia, Mike Papadakis University of Luxembourg, Luxembourg
11:15 | 15m Talk | Locating Framework-specific Crashing Faults with Compact and Explainable Candidate Set | Technical Track | Jiwei Yan Institute of Software at Chinese Academy of Sciences, China, MiaoMiao Wang Technology Center of Software Engineering, ISCAS, China; University of Chinese Academy of Sciences, China, Yepang Liu Southern University of Science and Technology, Jun Yan Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Long Zhang Institute of Software, Chinese Academy of Sciences
11:30 | 15m Talk | PExReport: Automatic Creation of Pruned Executable Cross-Project Failure Reports | Technical Track
11:45 | 15m Talk | Bug localization in game software engineering: evolving simulations to locate bugs in software models of video games | Showcase | Rodrigo Casamayor SVIT Research Group, Universidad San Jorge, Lorena Arcega San Jorge University, Francisca Pérez SVIT Research Group, Universidad San Jorge, Carlos Cetina San Jorge University, Spain
12:00 | 7m Talk | Real World Projects, Real Faults: Evaluating Spectrum Based Fault Localization Techniques on Python Projects | Journal-First Papers | Ratnadira Widyasari Singapore Management University, Singapore, Gede Artha Azriadi Prana Singapore Management University, Stefanus Agus Haryono Singapore Management University, Shaowei Wang University of Manitoba, David Lo Singapore Management University
12:07 | 7m Talk | Effective Isolation of Fault-Correlated Variables via Statistical and Mutation Analysis | Journal-First Papers | Ming Wen Huazhong University of Science and Technology, Zifan Xie Huazhong University of Science and Technology, Kaixuan Luo Huazhong University of Science and Technology, Xiao Chen Huazhong University of Science and Technology, Yibiao Yang Nanjing University, Hai Jin Huazhong University of Science and Technology
12:15 | 15m Talk | RAT: A Refactoring-Aware Traceability Model for Bug Localization | Technical Track | Feifei Niu University of Ottawa, Wesley Assunção Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil, Liguo Huang Southern Methodist University, Christoph Mayr-Dorn JOHANNES KEPLER UNIVERSITY LINZ, Jidong Ge Nanjing University, Bin Luo Nanjing University, Alexander Egyed Johannes Kepler University Linz