FORGE 2025
Sun 27 - Mon 28 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025

Deep learning and language models are increasingly dominating automated program repair research. While previous generate-and-validate approaches were able to find and use fix ingredients on a file or even project level, neural language models are limited to the code that fits their input window. In this work we investigate how important identifier ingredients are in neural program repair and present ScanFix, an approach that leverages an additional scanner model to extract identifiers from a bug’s file and potentially project-level context. We find that lack of knowledge of far-away identifiers is an important cause of failed repairs. Augmenting repair model input with scanner-extracted identifiers yields relative improvements of up to 31%. However, ScanFix is outperformed by a model with a large input window (> 5k tokens). When passing ingredients from the ground-truth fix, improvements are even higher. This shows that, with refined extraction techniques, ingredient scanning, similar to fix candidate ranking, could have the potential to become an important “subtask” of future automated repair systems. At the same time, it also demonstrates that this idea is subject to Sutton’s bitter lesson and may be rendered unnecessary by new code models with ever-increasing context windows.

Mon 28 Apr

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
Session4: Human-AI Collaboration & Legal Aspects of using FM
Research Papers / Industry Papers at 207
Chair(s): Zhenhao Li York University
11:00
12m
Long-paper
Extracting Fix Ingredients using Language Models
Research Papers
Julian Prenner Free University of Bozen-Bolzano, Romain Robbes CNRS, LaBRI, University of Bordeaux
11:12
12m
Long-paper
CodeFlow: Program Behavior Prediction with Dynamic Dependencies Learning
Research Papers
Cuong Chi Le FPT Software AI Center, Hoang Nhat Phan Nanyang Technological University, Huy Nhat Phan FPT Software AI Center, Tien N. Nguyen University of Texas at Dallas, Nghi D. Q. Bui Salesforce Research
11:24
12m
Long-paper
Addressing Specific and Complex Scenarios in Semantic Parsing
Research Papers
Yu Wang Xi'an Jiaotong University, Ming Fan Xi'an Jiaotong University, Ting Liu Xi'an Jiaotong University
11:36
12m
Long-paper
Skill over Scale: The Case for Medium, Domain-Specific Models for SE
Research Papers
Manisha Mukherjee Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
Pre-print
11:48
12m
Long-paper
Resource-Efficient & Effective Code Summarization
Research Papers
Saima Afrin William & Mary, Joseph Call William & Mary, Khai Nguyen William & Mary, Oscar Chaparro William & Mary, Antonio Mastropaolo William and Mary, USA
12:00
6m
Short-paper
How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering
Research Papers
Christoph Treude Singapore Management University, Marco Gerosa Northern Arizona University
Pre-print
12:06
6m
Short-paper
“So what if I used GenAI?” - Legal Implications of Using GenAI in Software Engineering Research
Research Papers
Gouri Ginde (Deshpande) University of Calgary
Pre-print
12:12
6m
Short-paper
Evaluating the Ability of GPT-4o to Generate Verifiable Specifications in VeriFast
Research Papers
Marilyn Rego Purdue University, Wen Fan Purdue University, Xin Hu University of Michigan - Ann Arbor, Sanya Dod, Zhaorui Ni Purdue University, Danning Xie Purdue University, Jenna DiVincenzo (Wise) Purdue University, Lin Tan Purdue University
12:18
6m
Short-paper
Towards Generating App Feature Descriptions Automatically with LLMs: the Setapp Case Study
Industry Papers