Studying and Understanding the Effectiveness and Failures of Conversational LLM-Based Repair (APR 2025)

Who

Aolin Chen, Haojun Wu, Qi Xin, Steven P. Reiss, Jifeng Xuan

Track

APR 2025 Automated Program Repair

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 29 Apr 2025 16:20 - 16:40 at 210 - APR Session 4 Chair(s): Tegawendé F. Bissyandé, Chao Peng

Abstract

Automated program repair (APR) aims to automate bug fixing. In recent years, thanks to the rapid development of large language models (LLMs), automated repair has achieved remarkable progress. Advanced APR techniques powered by conversational LLMs, most notably ChatGPT, have exhibited impressive repair abilities and gained increasing popularity due to the capabilities of the underlying LLMs in providing repair feedback and performing iterative patch improvement. Despite these strengths, conversational APR techniques still fail to repair a large number of bugs. For example, CHATREPAIR, a state-of-the-art conversational technique, does not correctly repair over half of the single-function bugs in the Defects4J dataset. To understand the effectiveness and failures of conversational LLM-based repair and to provide possible directions for improvement, we studied CHATREPAIR as a representative technique, focusing on comparing the effectiveness of its cloze-style and full-function repair strategies, assessing its key iterative component for patch improvement, and analyzing its repair failures. Our study has led to a series of findings, which we believe provide key implications for future research.

Aolin Chen

Wuhan University

Haojun Wu

Wuhan University

Qi Xin

Wuhan University

China

Steven P. Reiss

Brown University

Jifeng Xuan

Wuhan University

China

Session Program

Tue 29 Apr
Displayed time zone: Eastern Time (US & Canada)

16:00 - 17:30	APR Session 4 APR at 210 Chair(s): Tegawendé F. Bissyandé University of Luxembourg, Chao Peng ByteDance

16:00 20m Talk		Simple Fault Localization using Execution Traces APR Julian Prenner Free University of Bozen-Bolzano, Romain Robbes CNRS, LaBRI, University of Bordeaux
16:20 20m Talk		Studying and Understanding the Effectiveness and Failures of Conversational LLM-Based Repair APR Aolin Chen Wuhan University, Haojun Wu Wuhan University, Qi Xin Wuhan University, Steven P. Reiss Brown University, Jifeng Xuan Wuhan University
16:40 20m Talk		Towards Unveiling Vulnerability Remediation Tactics from OSS Community APR Lyuye Zhang Nanyang Technological University, Wu Jiahui, Chengwei Liu Nanyang Technological University, Kaixuan Li East China Normal University, Sen Chen Nankai University, Yang Liu Nanyang Technological University
17:00 20m Talk		Which Inputs Trigger my Patch? APR Martin Eberlein Humboldt-Universität zu Berlin, Moeketsi Raselimo Humboldt-Universität zu Berlin, Germany and Stellenbosch University, South Africa, Lars Grunske Humboldt-Universität zu Berlin