Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs? (ASE 2022 - Research Papers)

Who

Cedric Richter, Jan Haltermann, Marie-Christine Jakobs, Felix Pauck, Stefan Schott, Heike Wehrheim

Track

ASE 2022 Research Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 13 Oct 2022 16:00 - 16:20 at Ballroom C East - Technical Session 29 - AI for SE II Chair(s): Tim Menzies

Abstract

Debugging, that is, identifying and fixing bugs in software, is a central part of software development. Developers are therefore often confronted with the task of deciding whether a given code snippet contains a bug, and if so, where. Recently, data-driven methods have been employed to learn this task of bug detection, resulting (among others) in so-called neural bug detectors. Neural bug detectors are trained on millions of buggy and correct code snippets.

Given the “neural learning” procedure, it seems likely that neural bug detectors – on the specific task of finding bugs – perform similarly to human software developers. For this work, we set out to substantiate or refute this hypothesis. We report on the results of an empirical study with over 100 software developers, targeting the comparison of humans and neural bug detectors. As the detection task, we chose a specific form of bugs (variable misuse bugs) for which neural bug detectors have recently made significant progress.
Our study shows that despite the fact that neural bug detectors see millions of such examples during training, software developers – when conducting bug detection as a majority decision – are slightly better than neural bug detectors. Altogether, we find a large overlap in the performance, both for classifying code as buggy and for localizing the buggy line in the code.
In comparison to developers, one of the two evaluated neural bug detectors, however, raises a higher number of false alarms.
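
As an illustration only (not taken from the paper), the following minimal sketch shows what a variable misuse bug can look like and how individual developer answers could be combined into a majority decision. The functions `area_buggy`, `area_fixed`, and `majority_vote`, as well as the vote data, are hypothetical; the aggregation protocol actually used in the study may differ.

```python
from collections import Counter

# Hypothetical example of a variable misuse bug: an in-scope variable of the
# right type is used, but it is the wrong one (`width` instead of `height`).
def area_buggy(width, height):
    return width * width  # bug: the second operand should be `height`

def area_fixed(width, height):
    return width * height

def majority_vote(votes):
    """Return the most frequent answer among the given votes.

    How ties should be resolved is an open design choice and is not
    addressed in this sketch.
    """
    return Counter(votes).most_common(1)[0][0]

# Each vote is the line a (hypothetical) developer flags as buggy,
# or None if the developer considers the snippet correct.
votes = [2, 2, None, 2, 3]
decision = majority_vote(votes)
```

Here a clear majority of the hypothetical developers flags the same line, so the aggregated decision localizes the bug to that line even though two individual answers disagree.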

Link to Preprint

https://fpauck.de/papers/ase22-203.pdf

DOI

https://doi.org/10.1145/3551349.3561156

File attachments

Presentation (ASE-Comparable-Developers.pdf)	2.7MiB

Cedric Richter

University of Oldenburg

Germany

Jan Haltermann

University of Oldenburg

Marie-Christine Jakobs

Technical University of Darmstadt

Felix Pauck

Paderborn University, Germany

Stefan Schott

Paderborn University

Heike Wehrheim

University of Oldenburg

Survey Link (data will not be evaluated anymore)

Session Program

Thu 13 Oct
Displayed time zone: Eastern Time (US & Canada)

16:00 - 18:00	Technical Session 29 - AI for SE II (Research Papers / Journal-first Papers) at Ballroom C East. Chair(s): Tim Menzies, North Carolina State University

16:00 20m Research paper		Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs? Research Papers Cedric Richter University of Oldenburg, Jan Haltermann University of Oldenburg, Marie-Christine Jakobs Technical University of Darmstadt, Felix Pauck Paderborn University, Germany, Stefan Schott Paderborn University, Heike Wehrheim University of Oldenburg
16:20 20m Research paper		Learning Contract Invariants Using Reinforcement Learning Research Papers Junrui Liu University of California, Santa Barbara, Yanju Chen University of California at Santa Barbara, Bryan Tan Amazon Web Services, Işıl Dillig University of Texas at Austin, Yu Feng University of California at Santa Barbara
16:40 20m Research paper		Compressing Pre-trained Models of Code into 3 MB Research Papers Jieke Shi Singapore Management University, Zhou Yang Singapore Management University, Bowen Xu School of Information Systems, Singapore Management University, Hong Jin Kang Singapore Management University, Singapore, David Lo Singapore Management University
17:00 20m Research paper		A Transferable Time Series Forecasting Service using Deep Transformer model for Online Systems (Virtual) Research Papers Tao Huang Tencent, Pengfei Chen Sun Yat-Sen University, Jingrun Zhang School of Data and Computer Science, Sun Yat-sen University, Ruipeng Li Tencent, Rui Wang Tencent
17:20 20m Paper		The Weights can be Harmful: Pareto Search versus Weighted Search in Multi-Objective Search-Based Software Engineering (Virtual) Journal-first Papers Tao Chen Loughborough University, Miqing Li University of Birmingham
17:40 20m Research paper		Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets (Virtual) Research Papers Zhong Li Nanjing, Minxue Pan Nanjing University, Yu Pei Hong Kong Polytechnic University, Tian Zhang Nanjing University, Linzhang Wang Nanjing University, Xuandong Li Nanjing University