Evaluating Test-Suite Reduction in Real-World Software Evolution
Test-suite reduction (TSR) speeds up regression testing by removing redundant tests from the test suite, so that fewer tests run on future code changes. A developer considering TSR must weigh the cost of potentially removing tests that could detect future faults, and needs some way to predict how well the reduced test suite will detect those faults. Prior research evaluated this cost of TSR using faulty program versions, each constructed with a single seeded fault. Moreover, it is unknown whether any metric measured on the reduced test suite at the point of reduction effectively predicts the missed faults.
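To make the idea of removing redundant tests concrete, here is a minimal sketch of one common reduction heuristic: greedy selection that preserves the coverage of the original suite. The abstract does not say which TSR algorithms the study uses, so the greedy strategy, the `greedy_reduce` name, and the coverage data are illustrative assumptions.

```python
def greedy_reduce(coverage):
    """Return a reduced test list that keeps the original suite's coverage.

    coverage: dict mapping a test name to the set of requirements
    (e.g. statement ids) that the test covers. Hypothetical input format.
    """
    # Everything the full suite covers must stay covered after reduction.
    uncovered = set().union(*coverage.values())
    reduced = []
    while uncovered:
        # Pick the test covering the most still-uncovered requirements.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        reduced.append(best)
        uncovered -= coverage[best]
    return reduced


# Toy coverage data (assumed): t2 and t4 add no coverage beyond t1 and t3.
coverage = {
    "t1": {1, 2, 3},
    "t2": {2, 3},
    "t3": {4},
    "t4": {1, 4},
}
reduced = greedy_reduce(coverage)
print(reduced)
```

In this toy example `t2` and `t4` are dropped as redundant with respect to coverage, yet either could be the only test that fails on a future change, which is exactly the cost the abstract describes.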
In this paper, we perform the first extensive study of TSR using real test failures in (failed) builds that occurred for real code changes. We analyze 1478 failed builds from 32 GitHub projects that run their tests on the Travis continuous integration service. We compute reduced test suites on early, passing versions of each project. Because a failed build can contain multiple faults, we propose FFMap, a family of mappings from test failures to faults. We use these mappings to determine the percentage of failed builds in which the reduced test suite fails to detect all the faults, a metric we call failed-build detection loss (FBDL). We find that reduced test suites can miss up to 52.2% of failed builds, and that the FBDL of reduced test suites is higher than traditional TSR metrics suggest. Furthermore, traditional TSR metrics are not good predictors of FBDL. Our proposed use of historical FBDL to predict future FBDL is a better predictor than the traditional TSR metrics, but it is still not a good one. The lack of a good predictor makes it difficult for developers to decide whether to use reduced test suites. Our results raise important concerns about automated TSR.
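The FBDL computation can be sketched as follows. The abstract does not define the individual FFMap mappings, so the two mappings below (`one_fault`, where all failures in a build stem from a single fault, and `fault_per_test`, where every failing test exposes a distinct fault) are assumptions chosen to illustrate how the choice of mapping changes the result. The function names and build data are likewise hypothetical.

```python
def build_is_missed(failed_tests, reduced_suite, mapping):
    """Decide whether the reduced suite fails to detect all faults in a build.

    failed_tests: set of tests that failed in the build (full suite).
    reduced_suite: set of tests kept after reduction.
    mapping: assumed failure-to-fault mapping, see lead-in.
    """
    kept_failures = failed_tests & reduced_suite
    if mapping == "one_fault":
        # A single fault is detected if any failing test survives reduction.
        return not kept_failures
    if mapping == "fault_per_test":
        # Each failing test is its own fault, so all of them must survive.
        return kept_failures != failed_tests
    raise ValueError(f"unknown mapping: {mapping}")


def fbdl(failed_builds, reduced_suite, mapping):
    """Percentage of failed builds in which at least one fault goes undetected."""
    if not failed_builds:
        return 0.0
    missed = sum(
        build_is_missed(failed, reduced_suite, mapping) for failed in failed_builds
    )
    return 100.0 * missed / len(failed_builds)


# Toy data (assumed): four failed builds and a reduced suite of two tests.
reduced_suite = {"t1", "t3"}
failed_builds = [{"t1"}, {"t2"}, {"t1", "t2"}, {"t3", "t4"}]

print(fbdl(failed_builds, reduced_suite, "one_fault"))
print(fbdl(failed_builds, reduced_suite, "fault_per_test"))
```

The same builds and the same reduced suite give a much higher loss under the stricter mapping, which is why the study reports results over a family of mappings rather than a single one.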
Mon 16 Jul. Times are displayed in time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
|14:00 - 14:20|
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems: A Multi-Objective Search-Based Approach