The Impact of Flaky Tests on Historical Test Prioritization on Chrome
Test prioritization algorithms run the tests most likely to fail first, giving developers faster feedback when a failure occurs. Approaches that use historical failures to prioritize tests that have failed in the past may be susceptible to flaky tests, as these tests often fail and then pass without identifying a fault. Traditionally, flaky failures, like other types of failures, are considered blocking, i.e., the failure needs to be investigated before the code can move to the next stage. However, on Google Chrome, flaky failures are non-blocking and the code still moves to the next stage in the CI pipeline. In this work, we explain the Chrome testing pipeline and classification. Then, we re-implement two important history-based test prioritization algorithms and evaluate them on over 276 million test runs from the Chrome project. We apply these algorithms in two scenarios. First, we consider flaky failures as blocking; then, we use Chrome’s approach and consider flaky failures as non-blocking.
Our investigation reveals that 99.58% of all failures are flaky. These failures are much more repetitive than non-flaky failures, and they are also well distributed over time. We conclude that the prior performance of the prioritization algorithms has been inflated by flaky failures. We release our data and scripts in our replication package.
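The two evaluation scenarios can be illustrated with a minimal sketch. It is not one of the two algorithms re-implemented in the paper; it assumes a simple failure-count heuristic, and the names `Run` and `prioritize` are hypothetical. The `flaky_blocking` flag switches between counting flaky failures as history (blocking) and ignoring them (non-blocking, as on Chrome).

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Run(NamedTuple):
    """One historical test run (hypothetical record layout)."""
    test: str
    failed: bool
    flaky: bool  # True if the failure was classified as flaky


def prioritize(history: Iterable[Run], tests: List[str],
               flaky_blocking: bool) -> List[str]:
    """Order tests by how often they failed in the past, most failures first.

    If flaky_blocking is False, flaky failures are ignored when counting.
    """
    failures = Counter()
    for run in history:
        if run.failed and (flaky_blocking or not run.flaky):
            failures[run.test] += 1
    # sorted() is stable, so tests with equal counts keep their input order.
    return sorted(tests, key=lambda t: -failures[t])


history = [
    Run("a", failed=True, flaky=True),
    Run("a", failed=True, flaky=True),
    Run("b", failed=True, flaky=False),
    Run("c", failed=False, flaky=False),
]
tests = ["c", "b", "a"]

blocking_order = prioritize(history, tests, flaky_blocking=True)
non_blocking_order = prioritize(history, tests, flaky_blocking=False)
```

In this toy history, test `a` ranks first only when its flaky failures are counted; once they are ignored, `b` moves to the front. This is the kind of ranking shift that the two scenarios compare.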
Tue 10 May (time zone: Eastern Time, US & Canada)
Fri 13 May (time zone: Eastern Time, US & Canada)
13:00 - 14:00
Abbas Javan Jafari (Concordia University, Canada), Diego Costa (Concordia University, Canada), Rabe Abdalkareem (Carleton University), Emad Shihab (Concordia University), Nikolaos Tsantalis (Concordia University)
|Mining Idioms in the Wild|
SEIP - Software Engineering in Practice
Aishwarya Sivaraman (University of California, Los Angeles), Rui Abreu (Faculty of Engineering, University of Porto, Portugal), Andrew Scott (Facebook), Tobi Akomolede (Facebook), Satish Chandra (Facebook)
|The Impact of Flaky Tests on Historical Test Prioritization on Chrome|
SEIP - Software Engineering in Practice