A Study of Flaky Failure De-Duplication to Identify Unreliably Killed Mutants
Mutation testing is a crucial technique for evaluating how effectively a test suite identifies faults. However, non-deterministic behavior, particularly flaky tests, poses a significant challenge to confidence in mutation testing outcomes. Flaky tests make it uncertain why a mutant was killed, which undermines trust in the assessment. When a flaky test “kills” a mutant, does it reliably kill it, or does it only incidentally kill it due to flakiness? This study is the first to directly examine the impact of flaky tests on killing mutants, underscoring how flaky tests can yield unreliable mutation results. Our analysis of 22 Java projects, previously examined for test flakiness, reveals that 19% of the mutants killed by these flaky tests are killed as a result of flakiness in at least one test. We examined the efficacy of failure de-duplication approaches in distinguishing mutants that were reliably killed from those that were not. We show that failure de-duplication effectively approximates the true number of mutants reliably killed, at far lower computational cost than re-running each test for each mutant.
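The idea of using failure de-duplication to separate reliable from unreliable kills can be illustrated with a minimal sketch. It is not the authors' implementation: the `Failure` record, the `signature` normalization (test name, exception type, and message with digits masked), and the `classify_kills` function are all hypothetical names and choices made for illustration. The sketch assumes that a set of known flaky failures has already been collected from the unmutated program, and treats a mutant as unreliably killed when every failure observed on it matches one of those known flaky failures.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    """One observed test failure (hypothetical record, for illustration)."""
    test: str
    exception: str
    message: str


def signature(failure):
    # Assumed de-duplication key: test name, exception type, and the
    # message with digit runs masked so that timings or ids do not matter.
    normalized = re.sub(r"\d+", "<n>", failure.message)
    return (failure.test, failure.exception, normalized)


def classify_kills(known_flaky, mutant_failures):
    """Label each mutant as 'reliable', 'unreliable', or 'survived'.

    known_flaky: failures seen on the unmutated program (assumed input).
    mutant_failures: dict mapping a mutant id to the failures seen on it.
    """
    flaky_signatures = {signature(f) for f in known_flaky}
    labels = {}
    for mutant, failures in mutant_failures.items():
        if not failures:
            labels[mutant] = "survived"
        elif any(signature(f) not in flaky_signatures for f in failures):
            # At least one failure is new, so it is attributed to the mutant.
            labels[mutant] = "reliable"
        else:
            # Every failure duplicates a known flaky failure.
            labels[mutant] = "unreliable"
    return labels
```

Under these assumptions, classifying a mutant requires only one comparison per observed failure, which is the intuition behind the lower cost relative to re-running each test for each mutant.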
Tue 28 May (displayed time zone: Eastern Time, US & Canada)
Session: Mutation, 14:00 - 15:30
14:00 (30m) Talk: Test Harness Mutilation. Samuel Moelius (Trail of Bits)
14:30 (30m) Talk: An Empirical Evaluation of Manually Created Equivalent Mutants. Philipp Straubinger (University of Passau), Alexander Degenhart (University of Passau), Gordon Fraser (University of Passau)
15:00 (30m) Talk: A Study of Flaky Failure De-Duplication to Identify Unreliably Killed Mutants. Abdulrahman Alshammari (George Mason University), Paul Ammann (George Mason University, USA), Michael Hilton (Carnegie Mellon University), Jonathan Bell (Northeastern University)