Shaping Program Repair Space with Existing Patches and Similar Code
ISSTA paper
Automated program repair (APR) has great potential to reduce bug-fixing effort, and many promising approaches have been proposed in recent years. APR is often treated as a search problem in which the search space consists of all possible patches and the goal is to identify the correct patch in that space. Current approaches usually take a data-driven approach to reduce the search space and to estimate the likelihood that a patch is correct. They commonly use existing patches and source code as data sources and have explored the effectiveness of both for automatic program repair. In particular, recent research revealed that many fix ingredients can be found in existing source code at a finer granularity.
In this paper, we propose a novel automatic program repair approach that takes the intersection of two search spaces obtained by analyzing 1) existing patches and 2) source code via a fine-grained code differencing algorithm. In this way, our approach reduces the search space and allows existing code snippets to be adapted at a finer granularity for patch generation. We have implemented our approach as a tool called SimFix and evaluated it on the Defects4J benchmark. Our tool successfully fixed 34 bugs. To the best of our knowledge, this is the largest number of bugs fixed by a single technique on the Defects4J benchmark. Furthermore, as far as we know, 13 of the bugs fixed by our approach have never been fixed by existing approaches.
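
As a rough illustration of the intersection idea, the following minimal Python sketch keeps only those candidate modifications (as might be derived by differencing the faulty code against a similar snippet) whose abstract form also occurs frequently in existing patches. It is not SimFix's implementation: the `Modification` type, the `(kind, node_type)` abstraction, the `min_count` threshold, and the function names are all hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Set, Tuple


class Modification(NamedTuple):
    """A concrete edit (hypothetical representation for this sketch)."""
    kind: str       # assumed vocabulary: "replace", "insert", "delete"
    node_type: str  # AST node type touched, e.g. "Variable"
    detail: str     # concrete change, e.g. "x -> y"

    def abstract(self) -> Tuple[str, str]:
        # Drop the concrete detail so edits can be compared across projects.
        return (self.kind, self.node_type)


def abstract_space(history: Iterable[Modification],
                   min_count: int = 2) -> Set[Tuple[str, str]]:
    """Space 1: abstract modifications seen at least `min_count` times
    in existing patches (the threshold is an assumption)."""
    counts = Counter(m.abstract() for m in history)
    return {pattern for pattern, n in counts.items() if n >= min_count}


def intersect(candidates: Iterable[Modification],
              allowed: Set[Tuple[str, str]]) -> List[Modification]:
    """Keep candidates from space 2 (edits suggested by similar code)
    whose abstract form also lies in space 1."""
    return [m for m in candidates if m.abstract() in allowed]


# Toy data standing in for mined patches and for diff results.
history = [
    Modification("replace", "Variable", "x -> y"),
    Modification("replace", "Variable", "i -> j"),
    Modification("insert", "IfStatement", "if (p != null)"),
    Modification("insert", "IfStatement", "if (n > 0)"),
    Modification("delete", "ReturnStatement", "return 0"),
]
candidates = [
    Modification("replace", "Variable", "len -> size"),
    Modification("delete", "ReturnStatement", "return null"),
    Modification("insert", "IfStatement", "if (size > 0)"),
]

allowed = abstract_space(history)
kept = intersect(candidates, allowed)
for m in kept:
    print(m.kind, m.node_type, m.detail)
```

In this toy run the `delete ReturnStatement` candidate is discarded because that abstract modification appears only once in the example history, while the other two candidates survive the intersection.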