Identifying Test-Suite-Overfitted Patches through Test Case Generation
A typical automatic program repair technique that uses a test suite as its correctness criterion can produce a patched program that is test-suite-overfitted (or overfitting): it passes the test suite but does not actually repair the bug. In this paper, we propose DiffTGen, which identifies an overfitting patched program by first generating new test inputs that uncover semantic differences between the original faulty program and the patched program, then testing the patched program based on those semantic differences, and finally generating test cases. Such a test case could be added to the original test suite to strengthen it and could prevent the repair technique from generating a similar overfitting patch again. We evaluated DiffTGen on 89 patches generated by four automatic repair techniques for Java, 79 of which are likely to be overfitting and incorrect. DiffTGen identifies 39 (49.4%) of these overfitting patches and yields the corresponding test cases. We further show that an automatic repair technique, if configured with DiffTGen, could avoid yielding overfitting patches and potentially produce correct ones.
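The three steps named in the abstract (find inputs where the faulty and patched programs differ, check the patched behavior on those inputs, emit a test case) can be illustrated with a minimal Java sketch. This is not DiffTGen's implementation: the class `DiffTestSketch`, the method `findOverfittingWitness`, the `TestCase` holder, and the toy absolute-value programs are all hypothetical names chosen for illustration. Brute-force enumeration over a small integer range stands in for real test input generation, and the oracle is assumed to be available as a function.

```java
import java.util.Optional;
import java.util.function.IntUnaryOperator;

public class DiffTestSketch {

    /** Hypothetical holder for a generated test case: an input and the output the oracle expects. */
    static final class TestCase {
        final int input;
        final int expected;

        TestCase(int input, int expected) {
            this.input = input;
            this.expected = expected;
        }

        @Override
        public String toString() {
            return "input=" + input + ", expected=" + expected;
        }
    }

    /**
     * Searches [lo, hi] for an input on which the faulty and patched programs
     * disagree AND the patched output contradicts the oracle. Such an input
     * shows the patch is overfitting and doubles as a new regression test.
     */
    static Optional<TestCase> findOverfittingWitness(
            IntUnaryOperator faulty,
            IntUnaryOperator patched,
            IntUnaryOperator oracle, // assumption: expected behavior is available as a function
            int lo,
            int hi) {
        for (int x = lo; x <= hi; x++) {
            int before = faulty.applyAsInt(x);
            int after = patched.applyAsInt(x);
            if (before == after) {
                continue; // no semantic difference on this input
            }
            int expected = oracle.applyAsInt(x);
            if (after != expected) {
                // The patch changed behavior here, but to something still wrong.
                return Optional.of(new TestCase(x, expected));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Toy bug: an "absolute value" routine that forgets to negate.
        IntUnaryOperator faulty = x -> x;
        // Overfitted patch: passes a suite containing only abs(-1) == 1 and abs(2) == 2.
        IntUnaryOperator patched = x -> x < 0 ? 1 : x;

        Optional<TestCase> witness =
                findOverfittingWitness(faulty, patched, Math::abs, -3, 3);
        System.out.println(witness.map(TestCase::toString).orElse("no witness found"));
    }
}
```

A patch that only changes behavior on inputs where the new output matches the oracle yields an empty result in this sketch. An empty result means no overfitting was revealed within the searched range, not that the patch is correct.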
Wed 12 Jul
|10:30 - 10:55|
|10:55 - 11:20|
Anil Koyuncu (University of Luxembourg, Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg, Luxembourg), Dongsun Kim (University of Luxembourg), Jacques Klein (University of Luxembourg), Martin Monperrus, Yves Le Traon (University of Luxembourg)
|11:20 - 11:45|
Sonal Mahajan (University of Southern California, USA), Abdulmajeed Alameer (University of Southern California, USA), Phil McMinn (University of Sheffield), William G.J. Halfond (University of Southern California)