Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features (APR 2022)

Who

Gareth Bennett, Tracy Hall, David Bowes

Track

APR 2022


When

Thu 19 May 2022 10:45 - 10:52 at APR room - Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features

Abstract

Defects4J is a popular dataset against which many Java Automatic Program Repair (APR) tools benchmark their performance. However, recent evidence suggests that some APR tools overfit to Defects4J, producing plausible patches that are incorrect. What we do not currently know is whether the plausible patches that turn out to be incorrect share any common features. We compare the features of Defects4J's human-written patches for defects that existing APR tools patched correctly with those for defects that the tools patched incorrectly. We found that 48.4% of the defects in Defects4J v1.5 have been automatically patched by existing APR tools: 28.9% correctly and 19.5% incorrectly. We found that the human-written patches of defects incorrectly patched by APR tools were twice the size of those of defects that have been correctly patched. We also found that human-written patches that added a method call, added a variable, or wrapped existing code with new code, such as a try/catch block, were significantly associated with incorrect patches. Editing only a single line was significantly associated with correct patches. Our results suggest that current tools are weak at generating multi-line patches and at synthesising new code, especially when wrapping existing code. Our results highlight potential future areas of development for new APR approaches.
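The three percentages in the abstract are all shares of the whole dataset (28.9% + 19.5% = 48.4%). As a rough sketch, the snippet below re-expresses them as shares of the patched defects only, which is not a figure the abstract reports; the variable and function names are illustrative, and the results inherit the rounding of the published percentages.

```python
# Shares of Defects4J v1.5 defects, as reported in the abstract (percent).
patched = 48.4    # patched by at least one existing APR tool
correct = 28.9    # patched correctly
incorrect = 19.5  # patched, but incorrectly

# Sanity check: the two outcome groups add up to the patched share.
assert abs((correct + incorrect) - patched) < 1e-9


def share_of_patched(part, whole=patched):
    """Return `part` as a percentage of `whole`, rounded to one decimal place.

    Hypothetical helper for illustration; not taken from the paper.
    """
    return round(100.0 * part / whole, 1)


correct_given_patched = share_of_patched(correct)
incorrect_given_patched = share_of_patched(incorrect)
unpatched = round(100.0 - patched, 1)

print(f"Correct among patched defects:   {correct_given_patched}%")
print(f"Incorrect among patched defects: {incorrect_given_patched}%")
print(f"Defects with no automatic patch: {unpatched}%")
```

Under these figures, roughly three in five of the patched defects received a correct patch and about two in five received only an incorrect one.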

Gareth Bennett

Lancaster University

Tracy Hall

Lancaster University

David Bowes

Lancaster University


Session Program

Thu 19 May
Displayed time zone: (GMT-04:00) Eastern Time (US & Canada)

10:45 - 11:00	Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features (APR, at APR room)

10:45 7m Talk		Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features (APR). Gareth Bennett (Lancaster University), Tracy Hall (Lancaster University), David Bowes (Lancaster University)
10:52 7m Live Q&A		Q&A APR

Information for Participants

Thu 19 May 2022 10:45 - 11:00 at APR room - Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features
