Regression Test History Data for Flaky Test Research
Due to their random nature, flaky test failures are difficult to study. Unless a test has been observed to both pass and fail under the same setup, it is unknown whether the test is flaky and what its failure rate is. Flaky-test research has therefore greatly benefited from the data records of previous studies, which provide evidence of flaky test failures and a rough indication of the failure rates to expect. To assess the impact of flaky tests on developers' work, it is also important to know how flaky test failures manifest over a regression test history, i.e., under continuous changes to the test code or the code under test. Existing datasets on flaky tests are mostly based on re-runs against an invariant code base, yet the actual effects of flaky tests on development can only be assessed across the commits of an evolving history, against which (potentially flaky) regression tests are executed. In our presentation, we outline approaches to bridge this gap and report on our experiences following one of them. As a result of this work, we contribute a dataset of flaky test failures across a simulated regression test history.
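The re-run-based evidence the abstract refers to can be sketched in a few lines: a test counts as (evidence-)flaky only if it has been observed to both pass and fail under the same setup, and its failure rate can be roughly estimated from the same re-runs. All names and the simulated test below are illustrative assumptions, not part of the presented dataset:

```python
# Minimal sketch: classify a test as "evidence-flaky" by re-running it
# under an unchanged setup, as re-run-based flaky-test datasets do.
# All names here are illustrative, not taken from the presented work.

def observe(run_test, reruns=10):
    """Re-run a test; return (is_flaky, failure_rate).

    A test counts as flaky only if it was observed to both pass and
    fail under the same setup; the failure rate is a rough estimate
    whose precision is bounded by the number of re-runs.
    """
    outcomes = [run_test() for _ in range(reruns)]  # True = pass
    is_flaky = True in outcomes and False in outcomes
    return is_flaky, outcomes.count(False) / reruns

# A deterministic stand-in for a flaky test: fails every third run.
calls = {"n": 0}
def sometimes_failing_test():
    calls["n"] += 1
    return calls["n"] % 3 != 0

flaky, rate = observe(sometimes_failing_test)
print(flaky, rate)  # prints: True 0.3
```

Note that the same re-runs say nothing about how these failures would surface across an evolving commit history, which is the gap the presented dataset targets.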
Slides: ftw2024-slides_regression-test-history-data.html (3.79 MiB)
Sun 14 Apr (time zone: Lisbon)
11:00 - 12:30
- 11:00 (30m, Paper): Presubmit Rescue: Automatically Ignoring Flaky Test Executions (FTW)
- 11:30 (30m, Paper): Regression Test History Data for Flaky Test Research (FTW, file attached)
- 12:00 (30m, Paper): Predicting the Lifetime of Flaky Tests on Chrome (FTW)