Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs?
Debugging, that is, identifying and fixing bugs in software, is a central part of software development. Developers are therefore often confronted with the task of deciding whether a given code snippet contains a bug and, if so, where. Recently, data-driven methods have been employed to learn this bug detection task, resulting (among others) in so-called neural bug detectors. Neural bug detectors are trained on millions of buggy and correct code snippets.
Given the “neural learning” procedure, it seems likely that neural bug detectors, on the specific task of finding bugs, perform similarly to human software developers. In this work, we set out to substantiate or refute this hypothesis. We report the results of an empirical study with over 100 software developers that compares humans with neural bug detectors. As the detection task, we chose a specific class of bugs (variable misuse bugs) for which neural bug detectors have recently made significant progress.
Our study shows that despite the fact that neural bug detectors see millions of such examples during training, software developers – when conducting bug detection as a majority decision – are slightly better than neural bug detectors. Altogether, we find a large overlap in the performance, both for classifying code as buggy and for localizing the buggy line in the code.
However, one of the two evaluated neural bug detectors raises more false alarms than developers do.
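For readers unfamiliar with the terms used in the abstract, the following is a minimal illustrative sketch and is not taken from the paper. It shows a variable misuse bug (an in-scope but unintended variable is used), one possible way to aggregate developer answers by majority decision, and a simple false-alarm rate. The function names, the example votes, and the tie-handling rule are all hypothetical; the study's actual protocol and metrics may differ.

```python
from collections import Counter

# Variable misuse: line 3 of this function returns `low` where `high` was intended.
def clamp_buggy(value, low, high):
    if value > high:
        return low
    if value < low:
        return low
    return value


def clamp_fixed(value, low, high):
    if value > high:
        return high
    if value < low:
        return low
    return value


def majority_vote(votes):
    """Return the most common answer, or None if there is a tie for first place.

    Each vote is a suspected buggy line number or the string "no bug".
    Returning None on ties is an assumption made for this sketch.
    """
    ranked = Counter(votes).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def false_alarm_rate(flagged, is_buggy):
    """Fraction of correct (non-buggy) snippets that were flagged as buggy."""
    on_correct = [f for f, buggy in zip(flagged, is_buggy) if not buggy]
    if not on_correct:
        return 0.0
    return sum(on_correct) / len(on_correct)


# Hypothetical answers from five developers for clamp_buggy.
votes = [3, 3, 5, "no bug", 3]
print(majority_vote(votes))

# Hypothetical detector output on four snippets, only the first of which is buggy.
print(false_alarm_rate([True, False, True, True], [True, False, False, False]))
```

In this sketch a tie yields no group decision; a real study would need to define how ties and abstentions are counted.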
Thu 13 Oct
Displayed time zone: Eastern Time (US & Canada)
16:00 - 18:00 | Technical Session 29 - AI for SE II | Research Papers / Journal-first Papers | Ballroom C East | Chair(s): Tim Menzies (North Carolina State University)
16:00 | 20m | Research paper | Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs? | Research Papers | Cedric Richter (University of Oldenburg), Jan Haltermann (University of Oldenburg), Marie-Christine Jakobs (Technical University of Darmstadt), Felix Pauck (Paderborn University, Germany), Stefan Schott (Paderborn University), Heike Wehrheim (University of Oldenburg)
16:20 | 20m | Research paper | Learning Contract Invariants Using Reinforcement Learning | Research Papers | Junrui Liu (University of California, Santa Barbara), Yanju Chen (University of California at Santa Barbara), Bryan Tan (Amazon Web Services), Işıl Dillig (University of Texas at Austin), Yu Feng (University of California at Santa Barbara)
16:40 | 20m | Research paper | Compressing Pre-trained Models of Code into 3 MB | Research Papers | Jieke Shi (Singapore Management University), Zhou Yang (Singapore Management University), Bowen Xu (School of Information Systems, Singapore Management University), Hong Jin Kang (Singapore Management University, Singapore), David Lo (Singapore Management University)
17:00 | 20m | Research paper | A Transferable Time Series Forecasting Service using Deep Transformer model for Online Systems | Research Papers, Virtual | Tao Huang (Tencent), Pengfei Chen (Sun Yat-Sen University), Jingrun Zhang (School of Data and Computer Science, Sun Yat-sen University), Ruipeng Li (Tencent), Rui Wang (Tencent)
17:20 | 20m | Paper | The Weights can be Harmful: Pareto Search versus Weighted Search in Multi-Objective Search-Based Software Engineering | Journal-first Papers, Virtual
17:40 | 20m | Research paper | Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets | Research Papers, Virtual | Zhong Li (Nanjing), Minxue Pan (Nanjing University), Yu Pei (Hong Kong Polytechnic University), Tian Zhang (Nanjing University), Linzhang Wang (Nanjing University), Xuandong Li (Nanjing University)