Developers often run tests to check that their latest changes to a code repository did not break previously working functionality. Ideally, any new test failure would indicate a regression caused by the latest changes. However, some test failures are caused not by the latest changes but by non-deterministic tests, popularly called flaky tests. The typical way to detect flaky tests is to rerun failing tests repeatedly. Unfortunately, rerunning failing tests can be costly and can slow down the development cycle.
We present the first extensive evaluation of rerunning failing tests and propose a new technique, called DeFlaker, that detects whether a test failure is due to a flaky test, without rerunning the test and with very low runtime overhead. DeFlaker monitors the coverage of the latest code changes and marks as flaky any newly failing test that did not execute any of those changes. We deployed DeFlaker live, in the build process of 96 Java projects on Travis CI, and found 87 previously unknown flaky tests in 10 of these projects. We also ran experiments on project histories, where DeFlaker detected 1,874 flaky tests from 4,846 failures, with a low false alarm rate (1.5%). DeFlaker had a higher recall of confirmed flaky tests (95.5% vs. 23%) than Maven's default flaky test detector.
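The classification rule at the heart of this approach can be sketched in a few lines of Java. The sketch below is illustrative only: the class, method, and parameter names (FlakyClassifierSketch, likelyFlaky, coverage, changedElements) are hypothetical, not DeFlaker's actual API, and it abstracts away how DeFlaker collects differential coverage during the build.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Minimal sketch of the coverage-based classification rule described in the
// abstract: a newly failing test that executed none of the latest changes is
// flagged as likely flaky. All names are hypothetical, not DeFlaker's API.
public class FlakyClassifierSketch {

    /**
     * @param newlyFailingTests tests that passed before the change but fail now
     * @param coverage          per-test set of code elements (e.g., statements)
     *                          executed during the current run
     * @param changedElements   code elements modified by the latest change
     * @return the newly failing tests flagged as likely flaky
     */
    static Set<String> likelyFlaky(List<String> newlyFailingTests,
                                   Map<String, Set<String>> coverage,
                                   Set<String> changedElements) {
        Set<String> flaky = new HashSet<>();
        for (String test : newlyFailingTests) {
            Set<String> executed = coverage.getOrDefault(test, Set.of());
            // A failure that never executed a changed element cannot be a
            // regression introduced by this change, so mark it as flaky.
            if (executed.stream().noneMatch(changedElements::contains)) {
                flaky.add(test);
            }
        }
        return flaky;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> coverage = Map.of(
                "testLogin", Set.of("Auth.java:42", "Auth.java:43"),
                "testTimeout", Set.of("Net.java:10"));
        Set<String> changed = Set.of("Auth.java:42");
        // Prints [testTimeout]: it fails without executing any changed line.
        System.out.println(likelyFlaky(
                List.of("testLogin", "testTimeout"), coverage, changed));
    }
}
```

Running the main method flags testTimeout, which fails without touching any changed line, while testLogin, which does execute the change, is treated as a potential regression.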
Session: Technical Papers. Thu 31 May, 11:00 - 12:30 (Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna time zone).

11:00 (20 min) Talk: DeFlaker: Automatically Detecting Flaky Tests
Jonathan Bell (George Mason University), Owolabi Legunsen (University of Illinois at Urbana-Champaign), Michael Hilton (Carnegie Mellon University, USA), Lamyaa Eloussi, Tifany Yung, Darko Marinov (University of Illinois at Urbana-Champaign)

11:20 (20 min) Talk: DetReduce: Minimizing Android GUI Test Suites for Regression Testing
Wontae Choi, Koushik Sen (University of California, Berkeley), George Necula (University of California, Berkeley), Wenyu Wang (University of Illinois at Urbana-Champaign)

11:40 (20 min) Talk: Time to Clean your Test Objectives
Michaël Marcozzi (Imperial College London), Sébastien Bardin, Nikolai Kosmatov, Mike Papadakis (University of Luxembourg), Virgile Prevosto, Loïc Correnson

12:00 (20 min) Talk: Prioritizing Browser Environments for Web Application Test Execution
Junghyun Kwon, In-Young Ko (Korea Advanced Institute of Science and Technology), Gregg Rothermel (University of Nebraska - Lincoln)

12:20 (10 min) Q&A in groups