The Impact of Flaky Tests on Historical Test Prioritization on Chrome
Fri 13 May 2022 13:10 - 13:15 at ICSE room 3-odd hours - Mining Software Repositories 7 Chair(s): Grace Lewis
Test prioritization algorithms run the tests that are most likely to fail first, so that developers get faster feedback when a failure occurs. Approaches that use historical failures to select tests that have failed in the past may be susceptible to flaky tests, because these tests often fail and then pass without identifying a fault. Traditionally, flaky failures, like other types of failures, are considered blocking, i.e., the failing test must be investigated before the code can move to the next stage. However, on Google Chrome, flaky failures are non-blocking and the code still moves to the next stage in the CI pipeline. In this work, we explain the Chrome testing pipeline and classification. Then, we re-implement two important history-based test prioritization algorithms and evaluate them on over 276 million test runs from the Chrome project. We apply these algorithms in two scenarios. First, we consider flaky failures as blocking; then, we use Chrome’s approach and consider flaky failures as non-blocking.
Our investigation reveals that 99.58% of all failures are flaky. These types of failures are much more repetitive than non-flaky failures, and they are also well distributed over time. We conclude that the prior performance of the prioritization algorithms has been inflated by flaky failures. We release our data and scripts in our replication package.
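To make the two evaluation scenarios concrete, the following is a minimal sketch of a generic history-based prioritizer that orders tests by their past failure count. It is not one of the two algorithms re-implemented in the paper; the `Run` record, the `prioritize` function, the `flaky_blocking` flag, and the test names are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Run(NamedTuple):
    """One historical test run (hypothetical record layout)."""
    test: str     # test identifier
    failed: bool  # True if this run failed
    flaky: bool   # True if the failure was classified as flaky


def prioritize(history: Iterable[Run], tests: list[str], flaky_blocking: bool) -> list[str]:
    """Order tests by past failure count, most failures first.

    With flaky_blocking=True, flaky failures count like any other failure.
    With flaky_blocking=False, flaky failures are ignored.
    """
    failures = Counter()
    for run in history:
        if run.failed and (flaky_blocking or not run.flaky):
            failures[run.test] += 1
    # sorted() is stable, so tests with equal counts keep their input order.
    return sorted(tests, key=lambda t: -failures[t])


# Illustrative data only, not taken from the Chrome dataset.
history = [
    Run("test_render", failed=True, flaky=True),
    Run("test_render", failed=True, flaky=True),
    Run("test_render", failed=False, flaky=False),
    Run("test_network", failed=True, flaky=False),
    Run("test_layout", failed=False, flaky=False),
]
tests = ["test_layout", "test_network", "test_render"]

blocking_order = prioritize(history, tests, flaky_blocking=True)
non_blocking_order = prioritize(history, tests, flaky_blocking=False)
```

In this toy history, the repeatedly flaky `test_render` is ranked first when flaky failures count as blocking and last when they do not, which illustrates how repetitive flaky failures can make a history-based ranking look more effective than it is.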
Fri 13 May (displayed time zone: Eastern Time (US & Canada))
13:00 - 14:00 | Mining Software Repositories 7 | SEIP - Software Engineering in Practice / Journal-First Papers | at ICSE room 3-odd hours | Chair(s): Grace Lewis (Carnegie Mellon Software Engineering Institute)
13:00 5m Talk | Dependency Smells in JavaScript Projects | Journal-First Papers | Abbas Javan Jafari (Concordia University, Canada), Diego Costa (Concordia University, Canada), Rabe Abdalkareem (Carleton University), Emad Shihab (Concordia University), Nikolaos Tsantalis (Concordia University)
13:05 5m Talk | Mining Idioms in the Wild | SEIP - Software Engineering in Practice | Aishwarya Sivaraman (University of California, Los Angeles), Rui Abreu (Faculty of Engineering, University of Porto, Portugal), Andrew Scott (Facebook), Tobi Akomolede (Facebook), Satish Chandra (Facebook)
13:10 5m Talk | The Impact of Flaky Tests on Historical Test Prioritization on Chrome | SEIP - Software Engineering in Practice