Challenging Bug Prediction and Repair Models with Synthetic Bugs (SCAM 2025 - Research Track)

Who

Ali Reza Ibrahimzada, Yang Chen, Ryan Rong, Reyhaneh Jabbarvand

Track

SCAM 2025 Research Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 9 Sep 2025 13:50 - 14:10 at OGGB5 260-051 - Analysis 3 Chair(s): Coen De Roover

Abstract

Bugs are essential in software engineering; over the past decades, many techniques have been proposed to detect, localize, and repair bugs in software systems. Evaluating the effectiveness of such techniques requires complex bugs, i.e., those that are hard to detect through testing and hard to repair through debugging. From the classic software engineering point of view, a hard-to-repair bug differs from the correct code in multiple locations, making it hard to localize and repair. Hard-to-detect bugs, on the other hand, manifest themselves only under specific test inputs and reachability conditions. These two objectives, i.e., generating hard-to-detect and hard-to-repair bugs, are mostly aligned: a bug generation technique can change multiple statements that are covered only under a specific set of inputs. However, the two objectives conflict for learning-based techniques: to challenge a bug prediction model, a bug should have a code representation similar to that of the correct code in the training data, so that the model struggles to distinguish them. The hard-to-repair bug definition remains the same but with a caveat: the more a bug differs from the original code (at multiple locations), the more distant their representations are and the easier the bug is to detect. This calls for new bug generation techniques that complement existing bug datasets and challenge learning-based bug prediction and repair techniques. We propose BugFarm to transform arbitrary code into multiple hard-to-detect and hard-to-repair bugs. BugFarm mutates code in multiple locations (hard-to-repair) but uses attention analysis to change only the locations least attended by the underlying model (hard-to-detect).
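The idea of targeting the least attended locations can be illustrated with a minimal sketch. It is not BugFarm's implementation: the attention matrix, the token-to-statement mapping, the column-sum aggregation, and the function name `least_attended_statements` are all assumptions made for illustration. The sketch scores each statement by the average attention its tokens receive and returns the lowest-scoring ones as candidate mutation sites.

```python
from typing import List, Sequence


def least_attended_statements(attn: Sequence[Sequence[float]],
                              stmt_of_token: Sequence[int],
                              k: int) -> List[int]:
    """Return the indices of the k statements that receive the least attention.

    attn[i][j] is the attention that token i pays to token j (assumed input,
    e.g. an average over the heads and layers of some model).
    stmt_of_token[j] is the index of the statement containing token j.
    """
    num_statements = max(stmt_of_token) + 1
    totals = [0.0] * num_statements
    counts = [0] * num_statements

    for j, stmt in enumerate(stmt_of_token):
        # Attention received by token j is the sum of column j.
        totals[stmt] += sum(row[j] for row in attn)
        counts[stmt] += 1

    # Mean attention received per token, for each statement.
    scores = [t / c if c else float("inf") for t, c in zip(totals, counts)]

    ranked = sorted(range(num_statements), key=lambda s: scores[s])
    return sorted(ranked[:k])


# Hypothetical example: 4 tokens spread over 3 statements.
attn = [
    [0.4, 0.3, 0.1, 0.2],
    [0.5, 0.2, 0.1, 0.2],
    [0.3, 0.3, 0.1, 0.3],
    [0.4, 0.2, 0.1, 0.3],
]
stmt_of_token = [0, 0, 1, 2]
candidates = least_attended_statements(attn, stmt_of_token, k=2)
```

In this toy input, statement 0 receives the most attention, so statements 1 and 2 are selected. How attention is aggregated across heads, layers, and tokens is a design choice that this sketch leaves open.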
Our comprehensive evaluation of 435k+ bugs from over 1.9M mutants generated by BugFarm and two alternative approaches demonstrates BugFarm's superiority in generating bugs that are hard to detect by learning-based bug prediction approaches (up to 40.53% higher False Negative Rate and 10.76%, 5.2%, 28.93%, and 20.53% lower Accuracy, Precision, Recall, and F1 score, respectively) and hard to repair by a state-of-the-art learning-based program repair technique (28% repair success rate, compared to 36% and 49% for LEAM and μBERT bugs, respectively). BugFarm is efficient, i.e., it takes nine seconds to mutate a piece of code, with no training overhead.
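For readers unfamiliar with the detection metrics named above, the following sketch computes them from binary labels using their standard definitions. The function name `detection_metrics`, the label convention (1 = buggy, 0 = correct), and the sample data are assumptions for illustration and are not taken from the paper's evaluation.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int],
                      y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute standard binary classification metrics (1 = buggy, 0 = correct)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when the denominator is zero to avoid a division error.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        # False Negative Rate: share of real bugs the model misses.
        "fnr": ratio(fn, fn + tp),
    }


# Hypothetical predictions for four buggy and four correct methods.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
metrics = detection_metrics(y_true, y_pred)
```

A higher False Negative Rate means more bugs slip past the prediction model, which is why it is the headline measure of how hard a generated bug is to detect.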

Link to Preprint

https://arxiv.org/abs/2310.02407

DOI

https://doi.org/10.1109/SCAM67354.2025.00021

Ali Reza Ibrahimzada

University of Illinois Urbana-Champaign

United States

Yang Chen

University of Illinois at Urbana-Champaign

United States

Ryan Rong

Stanford University

United States

Reyhaneh Jabbarvand

University of Illinois at Urbana-Champaign

United States

Artifacts

Slides

Challenging Bug Prediction and Repair Models with Synthetic Bugs | SCAM 2025 | New Zealand

Session Program

Tue 9 Sep
Displayed time zone: Auckland, Wellington

13:30 - 14:30	Analysis 3 (Research Track) at OGGB5 260-051. Chair(s): Coen De Roover, Vrije Universiteit Brussel

13:30 (20m) Research paper: Configurable Ensembles for Software Similarity: Challenging the Notion of Universal Metrics. Research Track. Shujun Huang (Software Engineering Research Group (SERG), TU Delft), Sebastian Proksch (Delft University of Technology). [Pre-print]
13:50 (20m) Research paper: Challenging Bug Prediction and Repair Models with Synthetic Bugs. Research Track. Ali Reza Ibrahimzada (University of Illinois Urbana-Champaign), Yang Chen (University of Illinois at Urbana-Champaign), Ryan Rong (Stanford University), Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign). [DOI, Pre-print, Media Attached]
14:10 (20m) Research paper: Plaintext in the Wild: Investigating Secure Connection Label Accuracy for Android Apps. Research Track. Yusei Sakuraba (Okayama University), Hiroki Inayoshi (Okayama University), Shoichi Saito (Nagoya Institute of Technology), Akito Monden (Okayama University). [File Attached]