SCAM 2025
Sun 7 - Fri 12 September 2025 Auckland, New Zealand
co-located with ICSME 2025
Tue 9 Sep 2025 13:50 - 14:10 at OGGB5 260-051 - Analysis 3 Chair(s): Coen De Roover

Bugs are a central concern in software engineering; over the past decades, many techniques have been proposed to detect, localize, and repair bugs in software systems. Evaluating the effectiveness of such techniques requires complex bugs, i.e., those that are hard to detect through testing and hard to repair through debugging. From the classic software engineering point of view, a hard-to-repair bug differs from the correct code in multiple locations, making it hard to localize and repair. Hard-to-detect bugs, on the other hand, manifest themselves only under specific test inputs and reachability conditions. These two objectives, i.e., generating hard-to-detect and hard-to-repair bugs, are mostly aligned; a bug generation technique can change multiple statements to be covered only under a specific set of inputs. However, the two objectives conflict for learning-based techniques: to challenge a bug prediction model, a bug should have a code representation similar to that of the correct code in the training data, so that the model struggles to distinguish them. The hard-to-repair definition remains the same, but with a caveat: the more a bug differs from the original code (at multiple locations), the more distant their representations are and the easier the bug is to detect. This calls for new bug generation techniques that complement existing bug datasets and challenge learning-based bug prediction and repair techniques. We propose BugFarm to transform arbitrary code into multiple hard-to-detect and hard-to-repair bugs. BugFarm mutates code in multiple locations (hard-to-repair) but leverages attention analysis to change only the locations least attended by the underlying model (hard-to-detect).
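To make the idea of "least attended locations" concrete, here is a minimal sketch in Python. It is not BugFarm's implementation: the function name `least_attended_lines`, the per-token attention scores, and the token-to-line mapping are all hypothetical, and averaging attention per line is just one plausible way to rank locations.

```python
from collections import defaultdict


def least_attended_lines(token_lines, token_attention, k):
    """Return the k line numbers with the lowest mean attention, in line order.

    token_lines[i] is the source line of token i and token_attention[i] is the
    attention score a model assigned to that token (both hypothetical inputs).
    """
    if len(token_lines) != len(token_attention):
        raise ValueError("token_lines and token_attention must have equal length")

    totals = defaultdict(float)
    counts = defaultdict(int)
    for line, score in zip(token_lines, token_attention):
        totals[line] += score
        counts[line] += 1

    # Mean attention per line; ties are broken by line number for determinism.
    mean_by_line = {line: totals[line] / counts[line] for line in totals}
    ranked = sorted(mean_by_line, key=lambda line: (mean_by_line[line], line))
    return sorted(ranked[:k])


# Toy input: eight tokens spread over four lines, with made-up attention scores.
token_lines = [1, 1, 2, 2, 2, 3, 4, 4]
token_attention = [0.30, 0.20, 0.05, 0.05, 0.02, 0.25, 0.03, 0.10]
candidates = least_attended_lines(token_lines, token_attention, k=2)
print(candidates)  # lines 2 and 4 receive the least attention on average
```

In this toy setup, a mutation step would then be restricted to the returned lines, so that several locations change while the model's most attended tokens stay untouched.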
Our comprehensive evaluation of 435k+ bugs from over 1.9M mutants generated by BugFarm and two alternative approaches demonstrates BugFarm's superiority in generating bugs that are hard to detect by learning-based bug prediction approaches (up to 40.53% higher False Negative Rate and 10.76%, 5.2%, 28.93%, and 20.53% lower Accuracy, Precision, Recall, and F1 score, respectively) and hard to repair by a state-of-the-art learning-based program repair technique (28% repair success rate, compared to 36% and 49% for LEAM and μBERT bugs, respectively). BugFarm is efficient, i.e., it takes nine seconds to mutate a piece of code, with no training overhead.
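For readers less familiar with the reported metrics, the following Python sketch shows the standard way they are derived from a confusion matrix, assuming "buggy" is the positive class. The function name and the counts are illustrative only and are not taken from the paper's evaluation.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes the positive class is "buggy": fn counts bugs the model missed.
    """
    total = tp + fp + tn + fn
    positives = tp + fn
    predicted_positives = tp + fp

    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / predicted_positives if predicted_positives else 0.0
    recall = tp / positives if positives else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # False Negative Rate is the share of real bugs the model failed to flag.
    fnr = fn / positives if positives else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fnr": fnr}


# Illustrative counts only: 100 buggy and 100 correct samples.
metrics = classification_metrics(tp=60, fp=20, tn=80, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because FNR equals 1 minus Recall, a bug generator that raises a model's False Negative Rate necessarily lowers its Recall by the same amount.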

Tue 9 Sep

Displayed time zone: Auckland, Wellington

13:30 - 14:30
Analysis 3 (Research Track) at OGGB5 260-051
Chair(s): Coen De Roover Vrije Universiteit Brussel
13:30
20m
Research paper
Configurable Ensembles for Software Similarity: Challenging the Notion of Universal Metrics
Research Track
Shujun Huang Software Engineering Research Group (SERG), TU Delft, Sebastian Proksch Delft University of Technology
13:50
20m
Research paper
Challenging Bug Prediction and Repair Models with Synthetic Bugs
Research Track
Ali Reza Ibrahimzada University of Illinois Urbana-Champaign, Yang Chen University of Illinois at Urbana-Champaign, Ryan Rong Stanford University, Reyhaneh Jabbarvand University of Illinois at Urbana-Champaign
14:10
20m
Research paper
Plaintext in the Wild: Investigating Secure Connection Label Accuracy for Android Apps
Research Track
Yusei Sakuraba Okayama University, Hiroki Inayoshi Okayama University, Shoichi Saito Nagoya Institute of Technology, Akito Monden Okayama University