Exploring True Test Overfitting in Dynamic Automated Program Repair using Formal Methods (ICST 2023 - Previous Editions)

Who

Amirfarhad Nilizadeh, Gary T. Leavens, Xuan Bach D. Le, Corina S. Păsăreanu, David Cok

Track

ICST 2023 Previous Editions

Time Zone

The program is currently displayed in (GMT+01:00) Dublin.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 18 Apr 2023 14:00 - 14:20 at Grand canal - Session 10: Program Repair Chair(s): Gunel Jahangirova

Abstract

Automated program repair (APR) techniques have shown a promising ability to generate patches that fix program bugs automatically. Typically such APR tools are dynamic in the sense that they find bugs by testing and they validate patches by running a program’s test suite. Patches can also be validated manually. However, neither of these methods for validating patches can truly tell whether a patch is correct. Test suites are usually incomplete, and thus APR-generated patches may pass the tests but not be truly correct; in other words, the APR tools may be overfitting to the tests. The possibility of test overfitting leads to manual validation, which is costly, potentially biased, and can also be incomplete. Therefore, we must move past these methods to truly assess APR’s overfitting problem.

We aim to evaluate the test overfitting problem in dynamic APR tools using ground truth given by a set of programs equipped with formal behavioral specifications. Using these formal specifications and an automated verification tool, we found that there is definitely overfitting in the generated patches of seven well-studied APR tools, although many (about 59%) of the generated patches were indeed correct. Our study further points out two new problems that can affect APR tools: changes to the complexity of programs and numeric problems. An additional contribution is that we introduce the first publicly available data set of formally specified and verified Java programs, their test suites, and buggy variants, each of which has exactly one bug.
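The distinction the abstract draws between passing a test suite and being truly correct can be illustrated with a small sketch. Everything in it is hypothetical and not taken from the paper or its data set: the `abs`-style program, the three-case test suite, and the two candidate patches are invented for illustration, and a bounded exhaustive check over a small input range stands in for the automated verification tool, which would instead prove the specification for all inputs.

```java
import java.util.function.IntUnaryOperator;

public class OverfittingSketch {
    // Stand-in for a formal behavioral specification of "absolute value":
    // the result is non-negative and equals either x or -x.
    static boolean satisfiesSpec(int x, int result) {
        return result >= 0 && (result == x || result == -x);
    }

    // Hypothetical buggy program: forgets to negate negative inputs.
    static int buggyAbs(int x) { return x; }

    // Hypothetical overfitted patch: special-cases the one failing test input.
    static int overfittedAbs(int x) { return x == -3 ? 3 : x; }

    // Hypothetical correct patch.
    static int correctAbs(int x) { return x < 0 ? -x : x; }

    // Deliberately incomplete test suite: {input, expected output}.
    static final int[][] TESTS = { {0, 0}, {5, 5}, {-3, 3} };

    // Dynamic validation: does the candidate pass every test?
    static boolean passesTests(IntUnaryOperator f) {
        for (int[] t : TESTS) {
            if (f.applyAsInt(t[0]) != t[1]) return false;
        }
        return true;
    }

    // Assumption: a bounded exhaustive check replaces real verification here.
    static boolean meetsSpecOnRange(IntUnaryOperator f, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (!satisfiesSpec(x, f.applyAsInt(x))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("buggy passes tests:      " + passesTests(OverfittingSketch::buggyAbs));
        System.out.println("overfitted passes tests: " + passesTests(OverfittingSketch::overfittedAbs));
        System.out.println("overfitted meets spec:   " + meetsSpecOnRange(OverfittingSketch::overfittedAbs, -100, 100));
        System.out.println("correct meets spec:      " + meetsSpecOnRange(OverfittingSketch::correctAbs, -100, 100));
    }
}
```

In this sketch the overfitted patch is accepted by the test suite but rejected by the specification check, which is the situation the abstract calls test overfitting.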

DOI

https://doi.org/10.1109/ICST49551.2021.00033

Amirfarhad Nilizadeh

University of Central Florida

United States

Gary T. Leavens

University of Central Florida

United States

Xuan Bach D. Le

The University of Melbourne

Australia

Corina S. Păsăreanu

Carnegie Mellon University

United States

David Cok

Safer Software Consulting, LLC

Session Program

Tue 18 Apr
Displayed time zone: Dublin

14:00 - 15:30	Session 10: Program Repair (Research Papers / Previous Editions / Posters) at Grand canal. Chair(s): Gunel Jahangirova (USI Lugano, Switzerland)

14:00 20m Talk		Exploring True Test Overfitting in Dynamic Automated Program Repair using Formal Methods [Previous Editions]. Amirfarhad Nilizadeh (University of Central Florida), Gary T. Leavens (University of Central Florida), Xuan Bach D. Le (The University of Melbourne), Corina S. Păsăreanu (Carnegie Mellon University), David Cok (Safer Software Consulting, LLC)
14:20 20m Talk		Embedding Context as Code Dependencies for Neural Program Repair [Research Papers]. Noor Nashid (University of British Columbia), Mifta Sintaha (University of British Columbia), Ali Mesbah (University of British Columbia (UBC))
14:40 20m Talk		CorCA: An Automatic Program Repair Tool for Checking and Removing Effectively C Flaws [Research Papers]. João Inácio (LASIGE, Faculdade de Ciências da Universidade de Lisboa), Ibéria Medeiros (LaSIGE, Faculdade de Ciências da Universidade de Lisboa)
15:00 20m Talk		Set the right example when teaching programming: Test Informed Learning with Examples (TILE) [Research Papers]. Niels Doorn (Open Universiteit and NHL Stenden University of Applied Sciences), Tanja E. J. Vos (Universitat Politècnica de València and Open Universiteit), Beatriz Marín (Universitat Politècnica de València), Erik Barendsen (Open Universiteit)
15:20 5m Talk		Poster: Software Fault Localization as a Service (SFLaaS) [Posters]. Qusay Idrees Sarhan (Department of Software Engineering, University of Szeged), Hassan Bapeer Hassan (University of Duhok), Árpád Beszédes (Department of Software Engineering, University of Szeged)
15:25 5m Talk		Poster: Improving Spectrum Based Fault Localization For Python Programs Using Weighted Code Elements [Posters]. Qusay Idrees Sarhan (Department of Software Engineering, University of Szeged), Árpád Beszédes (Department of Software Engineering, University of Szeged)