FTW 2024 - 1st International Flaky Tests Workshop 2024 (FTW 2024)

What is a flaky test?

Software developers rely on test cases to identify bugs in their code and to provide a signal as to their code’s correctness. Should such signals have a history of unreliability, they not only become less informative, but may also be considered untrustworthy. In the context of software testing, practitioners refer to these unreliable signals as flaky tests. The definition varies slightly, but a flaky test is generally defined as a test case that can pass and fail without changes to the test case code or the code under test.
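One well-studied way a test can pass and fail with no code change is dependence on test execution order. The sketch below is a minimal illustration only: the shared `cache` and the names `check_stores_user`, `check_cache_starts_empty`, and `run_suite` are hypothetical and do not come from any tool or paper presented at the workshop.

```python
# Hypothetical shared state that both checks touch (e.g. a module-level cache).
cache = {}

def check_stores_user():
    """Writes to the shared cache and never cleans up after itself."""
    cache["user"] = "alice"
    return cache["user"] == "alice"

def check_cache_starts_empty():
    """Passes only if no earlier check has polluted the shared cache."""
    return len(cache) == 0

def run_suite(order):
    """Run the checks in the given order from a clean state; return pass/fail per check."""
    cache.clear()
    return [check() for check in order]

# Same test code, same code under test; only the execution order differs.
clean_first = run_suite([check_cache_starts_empty, check_stores_user])
polluter_first = run_suite([check_stores_user, check_cache_starts_empty])

print(clean_first)     # [True, True]
print(polluter_first)  # [True, False]
```

The second check fails only when the first one has run before it, so whether it passes depends on scheduling rather than on any change to the code.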

Why are they such a big deal?

Concurrency and randomness are well-established causes among many others, though flakiness has far-reaching negative consequences regardless of origin. These consequences are felt by developers from small open-source projects to the likes of Google, Microsoft, and Meta. Flaky tests challenge the assumption that a test failure implies a bug: they are a leading cause of “false alarm” test failures and, potentially more seriously, can mask the presence of a genuine bug. Time wasted debugging spurious failures may lead developers to ignore future test failures. This is detrimental to software stability, because while a flaky test may be unreliable, it could still indicate a genuine bug in some instances. The problem is further exacerbated when flaky tests accumulate, as developers may lose trust in the entire test suite.

What are we going to do about it?

Interest in flaky tests as a research topic has grown significantly within the software engineering community in recent years. This has produced a wide array of empirical studies on the causes of flaky tests and experimental tools for their detection and repair. Despite this, no dedicated workshop on the issue has ever been organized. We are therefore delighted to announce the first International Flaky Tests Workshop (FTW). The workshop welcomes submissions on topics relating to flaky tests and will provide an opportunity for academic researchers and industrial practitioners to exchange ideas about test flakiness. Please see the Call for Papers for more information.

Sun 14 Apr
Displayed time zone: Lisbon (GMT+01:00)

09:00 - 10:30	Keynote (FTW) at Amália Rodrigues. Chair(s): Martin Gruber (BMW Group, University of Passau)

09:00 90m Keynote		Keynote. Speaker: Darko Marinov (University of Illinois at Urbana-Champaign)

10:30 - 11:00	Coffee Break (Catering) at Open Space

10:30 30m Coffee break		Break (Catering)

11:00 - 12:30	Mitigating Flaky Failures in CI (FTW) at Amália Rodrigues. Chair(s): Tim A. D. Henderson (Google)

11:00 30m Paper		Presubmit Rescue: Automatically Ignoring FlakyTest Executions. Authors: Minh Hoang (Google), Adrian Berding
11:30 30m Paper		Regression Test History Data for Flaky Test Research. Authors: Philipp Wendler, Stefan Winter (Ulm University and LMU Munich)
12:00 30m Paper		Predicting the Lifetime of Flaky Tests on Chrome. Authors: Samaneh Malmir (Concordia University), Peter Rigby (Concordia University; Meta)

12:30 - 14:00	Lunch (Catering) at Open Space

12:30 90m Lunch		Lunch (Catering)

14:00 - 15:30	Debugging Flaky Tests in Different Domains (FTW) at Amália Rodrigues. Chair(s): Owain Parry (The University of Sheffield)

14:00 30m Paper		On the Impact of Hitting System Resource Limits on Test Flakiness. Authors: Fabian Leinen (Technical University of Munich), Alexander Perathoner (Technical University of Munich), Alexander Pretschner (TU Munich)
14:30 30m Paper		Flaky Tests in the AI Domain. Authors: Péter Attila Soha (Department of Software Engineering, University of Szeged), Béla Vancsics, Tamás Gergely (Department of Software Engineering, University of Szeged), Árpád Beszédes (Department of Software Engineering, University of Szeged)
15:00 30m Paper		Can ChatGPT Repair Non-Order-Dependent Tests? Authors: Yang Chen (University of Illinois at Urbana-Champaign), Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign)

15:30 - 16:00	Coffee Break (Catering) at Open Space

15:30 30m Coffee break		Break (Catering)

16:00 - 17:30	Discussion Panel (FTW) at Amália Rodrigues. Chair(s): Phil McMinn (University of Sheffield)

16:00 90m Panel		Discussion Panel. Panelists: Jonathan Bell (Northeastern University), Lionel Briand (University of Ottawa, Canada; Lero centre, University of Limerick, Ireland), Mark Harman (Meta Platforms, Inc. and UCL), Darko Marinov (University of Illinois at Urbana-Champaign), Sigrid Eldh (Ericsson AB, Mälardalen University, Carleton University)

Accepted Papers

	Title	Authors
	Can ChatGPT Repair Non-Order-Dependent Tests?	Yang Chen, Reyhaneh Jabbarvand
	Flaky Tests in the AI Domain	Péter Attila Soha, Béla Vancsics, Tamás Gergely, Árpád Beszédes
	On the Impact of Hitting System Resource Limits on Test Flakiness	Fabian Leinen, Alexander Perathoner, Alexander Pretschner
	Predicting the Lifetime of Flaky Tests on Chrome	Samaneh Malmir, Peter Rigby
	Presubmit Rescue: Automatically Ignoring FlakyTest Executions	Minh Hoang, Adrian Berding
	Regression Test History Data for Flaky Test Research	Philipp Wendler, Stefan Winter

Call for Papers

FTW welcomes submissions on topics relating to flaky tests. The workshop will provide an opportunity for academic researchers and industrial practitioners to exchange ideas about test flakiness and to find out about current research directions and industrial challenges. A major goal of FTW is to foster collaboration and exchange between academia and industry. The workshop is inclusive of quantitative, qualitative, and mixed-methods research. Topics of interest include (but are not limited to):

Costs and consequences of flaky tests.
Causes of flaky tests.
Detection of flaky tests.
Mitigation of flaky tests.
Repair of flaky tests.

We expect a significant portion of the day to be spent on presentations and discussions of extended abstracts, but there will also be more formal short paper presentations. Please note that due to ICSE restrictions, submissions cannot exceed 8 pages. Submissions can take one of two formats:

Extended abstract (max. 2 pages including references): New ideas, problems and challenges, view points, work in progress.
Short paper (max. 8 pages including references): Technical research, experience reports, empirical studies.

Submission

All submissions must be submitted via the following link: https://easychair.org/conferences/?conf=ftw24.

Each submission will be reviewed by the program committee with respect to suitability for the workshop, following a double-blind process for short papers and a single-blind process for extended abstracts. This means that the identity of short paper authors must not be revealed in their submissions. All authors should use the official “ACM Primary Article Template”, as can be obtained from the ACM Proceedings Template page. LaTeX users should use the sigconf option as well as review to produce line numbers. Authors of short papers must also use anonymous to omit author names. For example, a short paper author should include the following line at the beginning of the document:

\documentclass[sigconf,review,anonymous]{acmart}

Important Dates

Paper submission: December 7th 2023 AoE.
Acceptance notification: January 11th 2024 AoE.
Camera ready: January 25th 2024 AoE.

Discussion Panel

As part of the workshop, we’ll be hosting a discussion panel on test flakiness. The panel consists of researchers and practitioners with a proven track record in the field. The currently confirmed panelists are:

Name	Biography
Jonathan Bell	Jon is an Assistant Professor directing research in Software Engineering and Software Systems at Northeastern University. His research makes it easier for developers to create reliable and secure software by improving software testing and program analysis. Jon’s work on accelerating software testing has been recognized with an ACM SIGSOFT Distinguished Paper Award (ICSE ’14 – Unit Test Virtualization with VMVM), and was the basis for an industrial collaboration with Electric Cloud. His research in flaky tests has led to open source contributions to the Maven build system and Pit mutation testing framework. His program analysis research has resulted in several widely adopted runtime systems for the JVM, including the Phosphor taint tracking system (OOPSLA ’14) and CROCHET checkpoint/rollback tool (ECOOP ’18). His contributions to the object-oriented programming community were recognized with the 2020 Dahl-Nygaard Junior Researcher Prize, and he was invited to give a keynote address at SPLASH on this work. His research has been funded by the NSA and the NSF, and he is the recipient of the NSF CAREER award.
Lionel C. Briand	Lionel C. Briand is a professor of software engineering and has shared appointments between (1) The University of Ottawa, Canada and (2) The SnT centre for Security, Reliability, and Trust, University of Luxembourg. In collaboration with colleagues, over 25 years, he has run many collaborative research projects with companies in the automotive, satellite, aerospace, energy, financial, and legal domains. Lionel has held various engineering, academic, and leading positions in six countries. He was one of the founders of the ICST conference (IEEE Int. Conf. on Software Testing, Verification, and Validation, a CORE A event) and its first general chair. He was also EiC of Empirical Software Engineering (Springer) for 13 years and led, in collaboration with first Victor Basili and then Tom Zimmermann, the journal to the top tier of the very best publication venues in software engineering.
Mark Harman	Mark Harman is a full-time Research Scientist at FACEBOOK London, working on FACEBOOK’s Web Enabled Simulation system WW, together with a London-based FACEBOOK team focussing on AI for scalable software engineering. WW is Facebook’s Cyber-Cyber Digital Twin of its platforms, being built with the long-term aim of measuring, predicting and optimising behaviour across all FACEBOOK’s platforms. Mark also holds a part-time professorship at UCL and was previously the manager of FACEBOOK’s Sapienz team, which grew out of Majicke, a start up co-founded by Mark and acquired by FACEBOOK in 2017. The Sapienz tech has been fully deployed as part of FACEBOOK’s overall CI system since 2017 and the FACEBOOK Sapienz team continues to develop and extend it. Sapienz has found and helped to fix thousands of bugs before they hit production, on systems of tens of millions of lines of code, used by over 2.6 billion people worldwide every day. In his more purely scientific work, Mark co-founded the field of Search Based Software Engineering (SBSE), and is also known for scientific research on source code analysis, software testing, app store analysis and empirical software engineering. He received the IEEE Harlan Mills Award and the ACM Outstanding Research Award in 2019 for his work and was awarded a fellowship of the Royal Academy of Engineering in 2020.
Darko Marinov	Darko Marinov is a Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His main research interests are in Software Engineering, in particular improving software quality using software testing. He has a lot of fun looking for software bugs. He has published over 100 conference papers, winning three “test-of-time” awards – two ACM SIGSOFT Impact Paper awards (2012 and 2019) and one ASE Most Influential Paper Award (2015) – and eight more paper awards – seven ACM SIGSOFT Distinguished Paper awards (2002, 2005, 2010, 2015, 2016, 2017, 2021) and one CHI Best Paper Award (2017). His work has been supported by AFRL via BBN, Boeing, Facebook, Google, Huawei, IBM, Intel, Microsoft, NSF, Qualcomm, Samsung, and SRC.
Sigrid Eldh	Dr. Sigrid Eldh currently works full time leading research on Quality and Software Test at Ericsson AB, in Stockholm, where she has worked since 1994. She aids in research collaboration and supervision of PhD students as a senior lecturer at MDH and as an adjunct Professor at Carleton University, Ottawa in Canada. She earned her MSc in Computer Science from Uppsala University, and her PhD, titled “On Test Design”, from Mälardalens Högskola. She was the initiator of ISTQB and also started and chaired the Swedish charter, SSTB (Swedish Software Testing Board), for the first 7 years. She also started SAST (Swedish Association for Software Test), which she chaired in its first years. She currently serves as IEEE Software Editor-In-Chief.

Keynote Speaker

We are pleased to announce that Darko Marinov will be our keynote speaker. The work of Darko, alongside his students and collaborators, forms an integral strand of the research literature on flaky tests. He co-authored An Empirical Analysis of Flaky Tests, one of the earliest and most-cited studies in the field. This work introduced a range of categories for the causes of flaky tests that have been reused and adapted in many subsequent papers. Darko has also been involved in the development and scientific evaluation of several automated tools for dealing with flaky tests, including iDFlakies for detecting flaky tests and iFixFlakies for repairing order-dependent flaky tests. You can browse the full list of his publications on his personal website.

Questions? Use the FTW contact form.

Tim A. D. Henderson (Google), United States
Owain Parry (The University of Sheffield), United Kingdom
Martin Gruber (BMW Group, University of Passau), Germany
Phil McMinn (University of Sheffield), United Kingdom
Gordon Fraser (University of Passau)

Program Committee

Owain Parry, PC Chair (The University of Sheffield), United Kingdom
Phil McMinn, PC Chair (University of Sheffield), United Kingdom
Jonathan Bell (Northeastern University), United States
Antonia Bertolino (National Research Council, Italy), Italy
Filomena Ferrucci (University of Salerno), Italy
Mark Harman (Meta Platforms, Inc. and UCL), United Kingdom
Michael Hilton (Carnegie Mellon University), United States
Gregory Kapfhammer (Allegheny College), United States
Wing Lam (George Mason University), United States
Darko Marinov (University of Illinois at Urbana-Champaign), United States
Bao N. Nguyen (F5, USA), United States
Peter Rigby (Concordia University; Meta), Canada
August Shi (The University of Texas at Austin), United States
Silvia Regina Vergilio (Federal University of Paraná), Brazil
