Comparing developer-provided to user-provided tests for fault localization and automated program repair
To realistically evaluate a software testing or debugging technique, the technique must be run on defects and tests that are characteristic of those a developer would encounter in practice. In general, this means that the evaluation should use real defects and tests.
For example, to determine the utility of a fault localization technique, it could be run on real defects from a bug tracking system, using real tests that are committed to the version control repository along with the fix. Although such a methodology uses real tests, it may not use tests that are representative of the information a developer or tool would have in practice. The tests that a developer commits after fixing a defect may encode more information than was available to the developer when initially diagnosing the defect. This calls into question the results of past empirical studies on the effectiveness of fault localization and automated program repair that used developer-provided tests.
This paper compares, both quantitatively and qualitatively, developer-provided tests committed along with fixes (as found in the version control repository) to user-provided tests extracted from bug reports (as found in the issue tracker).
Our results provide evidence that developer-provided tests encode more information than is available in user-provided tests. For fault localization, evaluations that use developer-provided tests consistently overestimate a technique's ability to rank a defective statement among the top-n most suspicious statements. For automated program repair, developer-provided tests overstate effectiveness: with user-provided tests, far fewer correct patches are generated and repair time increases substantially. We also provide a novel benchmark that contains tests extracted from bug reports, and we make suggestions for improving the design and evaluation of fault localization and automated program repair techniques.
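To make the fault-localization metric above concrete, here is a minimal sketch, not taken from the paper, of spectrum-based fault localization with the Ochiai suspiciousness formula and a top-n check. The test names, coverage data, and statement ids are hypothetical, chosen only to show how a narrowly focused failing test versus a broad failing test can change whether the defective statement lands in the top-n.

# Minimal illustrative sketch (not the paper's implementation): spectrum-based
# fault localization with the Ochiai formula and the top-n metric.
from math import sqrt

def ochiai(ef, nf, ep):
    # ef/nf = failing tests that do/do not cover the statement;
    # ep = passing tests that cover it.
    denom = sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank_statements(coverage, failing):
    # coverage: {test name: set of covered statement ids};
    # failing: set of failing test names.
    # Returns statement ids sorted by descending suspiciousness.
    stmts = set().union(*coverage.values())
    total_failing = len(failing)
    scores = {}
    for s in stmts:
        ef = sum(1 for t, cov in coverage.items() if t in failing and s in cov)
        ep = sum(1 for t, cov in coverage.items() if t not in failing and s in cov)
        scores[s] = ochiai(ef, total_failing - ef, ep)
    return sorted(stmts, key=scores.get, reverse=True)

def in_top_n(ranking, defective_stmt, n):
    # The top-n metric: is the defective statement among the n most suspicious?
    return defective_stmt in ranking[:n]

# Hypothetical coverage: two passing tests plus one failing test; the defective
# statement is s3.
passing_cov = {"t_pass1": {"s3"}, "t_pass2": {"s1", "s2", "s3"}}
cov_dev = dict(passing_cov, t_fail={"s3"})               # focused, developer-style failing test
cov_user = dict(passing_cov, t_fail={"s1", "s2", "s3"})  # broad, user-report-style failing test

print(in_top_n(rank_statements(cov_dev, {"t_fail"}), "s3", n=1))   # True
print(in_top_n(rank_statements(cov_user, {"t_fail"}), "s3", n=1))  # False

In this toy example, the focused failing test ranks the defective statement first, while the broad failing test does not; it illustrates only how the top-n metric is computed, not the paper's results.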
Wed 18 Jul (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
11:00 - 12:30 | Porting and Repair (ISSTA Technical Papers) at Zurich II
Chair(s): Julian Dolby (IBM Thomas J. Watson Research Center)

11:00 (20m) Talk: Search-Based Detection of Deviation Failures in the Migration of Legacy Spreadsheet Applications
Mohammad M. Almasi (University of Manitoba), Hadi Hemmati (University of Calgary), Gordon Fraser (University of Passau), Phil McMinn (University of Sheffield), Janis Benefelds (SEB Life and Pensions Holding AB)

11:20 (20m) Talk: Making Data-Driven Porting Decisions with Tuscan
Kareem Khazem (University College London), Earl T. Barr (University College London), Petr Hosek (Google, Inc.)

11:40 (20m) Talk: Comparing developer-provided to user-provided tests for fault localization and automated program repair
René Just (University of Massachusetts, USA), Chris Parnin (NCSU), Ian Drosos (University of California, San Diego), Michael D. Ernst (University of Washington, USA)

12:00 (20m) Talk: Shaping Program Repair Space with Existing Patches and Similar Code (Pre-print)
Jiajun Jiang (Peking University), Yingfei Xiong (Peking University), Hongyu Zhang (The University of Newcastle), Qing Gao (Peking University), Xiangqun Chen (Peking University)

12:20 (10m) Q&A in groups