ICST 2023
Sun 16 - Thu 20 April 2023 Dublin, Ireland
Wed 19 Apr 2023 14:00 - 14:20 at Pearse suite - Session 15: Flaky Tests Chair(s): John Micco

Flaky tests are test cases that can pass or fail without code changes. They often waste the time of software developers and obstruct the use of continuous integration. Previous work has presented several automated techniques for detecting flaky tests, though many involve repeated test executions and a lot of source code instrumentation and thus may be both intrusive and expensive. While this motivates researchers to evaluate machine learning models for detecting flaky tests, prior work on the features used to encode a test case is limited. Without further study of this topic, machine learning models cannot perform to their full potential in this domain. Previous studies also exclude a specific, yet prevalent and problematic, category of flaky tests: order-dependent (OD) flaky tests. This means that prior research only addresses part of the challenge of detecting flaky tests with machine learning. Closing this knowledge gap, this paper presents a new feature set for encoding tests, called Flake16. Using 54 distinct pipelines of data preprocessing, data balancing, and machine learning models for detecting both non-order-dependent (NOD) and OD flaky tests, this paper compares Flake16 to another well-established feature set. To assess the new feature set’s effectiveness, this paper’s experiments use the test suites of 26 Python projects, consisting of over 67,000 tests. Along with identifying the most impactful metrics for using machine learning to detect both types of flaky test, the empirical study shows how Flake16 is better than prior work, including (1) a 13% increase in overall F1 score when detecting NOD flaky tests and (2) a 17% increase in overall F1 score when detecting OD flaky tests.
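To make the abstract's distinction concrete, here is a minimal sketch of an order-dependent (OD) flaky test. It is an illustration only and is not the paper's detection method: the `cache` dictionary, both test functions, and the helper `find_order_dependent` are hypothetical names invented for this example. The helper reruns the tests in every order, which is the kind of repeated execution the abstract describes as expensive and which motivates predicting flakiness from features instead.

```python
import itertools

# Hypothetical shared state that both tests touch.
cache = {}

def test_writes_cache():
    # Sets up state and checks it; passes in any order.
    cache["user"] = "alice"
    return cache["user"] == "alice"

def test_reads_cache():
    # Passes only if test_writes_cache ran earlier: order-dependent.
    return cache.get("user") == "alice"

def run_in_order(tests):
    # Start each run from a clean state, then record pass/fail per test.
    cache.clear()
    return {t.__name__: t() for t in tests}

def find_order_dependent(tests):
    # A test is flagged when its outcome differs across execution orders.
    outcomes = {t.__name__: set() for t in tests}
    for order in itertools.permutations(tests):
        for name, passed in run_in_order(order).items():
            outcomes[name].add(passed)
    return sorted(name for name, seen in outcomes.items() if len(seen) > 1)

flagged = find_order_dependent([test_writes_cache, test_reads_cache])
print(flagged)
```

A non-order-dependent (NOD) flaky test, by contrast, would vary in outcome even with the order fixed, for example because of timing or randomness, so trying different orders alone would not explain its failures. Trying every permutation is also factorial in the number of tests, so this approach is only feasible for tiny suites.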

Wed 19 Apr

Displayed time zone: Dublin

14:00 - 15:40
Session 15: Flaky Tests (Previous Editions / Research Papers) at Pearse suite
Chair(s): John Micco (VMware)
14:00
20m
Talk
Evaluating Features for Machine Learning Detection of Order- and Non-Order-Dependent Flaky Tests
Previous Editions
Owain Parry (The University of Sheffield), Gregory Kapfhammer (Allegheny College), Michael Hilton (Carnegie Mellon University), Phil McMinn (University of Sheffield)
14:20
20m
Talk
An Empirical Study of Flaky Tests in Python
Previous Editions
Martin Gruber (BMW Group, University of Passau), Stephan Lukasczyk (University of Passau), Florian Kroiß, Gordon Fraser (University of Passau)
14:40
20m
Talk
A Survey on How Test Flakiness Affects Developers and What Support They Need To Address It
Previous Editions
Martin Gruber (BMW Group, University of Passau), Gordon Fraser (University of Passau)
15:00
20m
Talk
Practical Flaky Test Prediction using Common Code Evolution and Test History Data
Research Papers
Martin Gruber (BMW Group, University of Passau), Michael Heine (BMW Group; Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Programming Systems Group), Norbert Oster (Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Programming Systems Group), Michael Philippsen (Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Programming Systems Group), Gordon Fraser (University of Passau)
15:20
20m
Talk
A Qualitative Study on the Sources, Impacts, and Mitigation Strategies of Flaky Tests
Previous Editions
Sarra Habchi (Ubisoft), Guillaume Haben (University of Luxembourg), Mike Papadakis (University of Luxembourg, Luxembourg), Maxime Cordy (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)