3rd International Flaky Tests Workshop 2026 (FTW 2026)
What is a flaky test?
Software developers rely on test cases to identify bugs in their code and to provide a signal as to their code’s correctness. Should such signals have a history of unreliability, they not only become less informative, but may also be considered untrustworthy. In the context of software testing, practitioners refer to these unreliable signals as flaky tests. The definition varies slightly, but a flaky test is generally defined as a test case that can pass and fail without changes to the test case code or the code under test.
Why are they such a big deal?
Concurrency and randomness are well-established causes among many others, though flakiness has far-reaching negative consequences regardless of origin. These consequences are felt by developers from small open-source projects to the likes of Google, Meta, and Microsoft. Flaky tests challenge the assumption that a test failure implies a bug: they are a leading cause of “false alarm” test failures and, potentially more seriously, can mask the presence of a genuine bug. Time wasted debugging spurious failures may lead developers to ignore future test failures. This is detrimental to software stability, because while a flaky test may be unreliable, it could still indicate a genuine bug in some instances. The problem is further exacerbated when flaky tests accumulate, as developers may lose trust in the entire test suite.
What are we going to do about it?
Interest in flaky tests as a research topic has grown significantly within the software engineering community in recent years. This has produced a wide array of empirical studies on the causes of flaky tests and experimental tools for their detection and repair. The International Flaky Tests Workshop (FTW) will be held for the third consecutive time at ICSE 2026. The workshop welcomes submissions on topics relating to flaky tests and will provide an opportunity for academic researchers and industrial practitioners to exchange ideas about test flakiness. The workshop will feature a panel discussion with leading experts from both academia and industry. Please see the Call for Papers for more information.
Mon 13 Apr
Displayed time zone: Brasilia, Distrito Federal, Brazil
Time | Duration | Type | Title | Track | Presenters
08:00 - 17:30
08:00 | 9h30m | Registration | ICSE 2026 Registration | Social, Networking and Special Rooms
09:00 - 10:30
09:00 | 5m | Talk | Opening | DeepTest
09:05 | 55m | Keynote | Vibe Coding ≠ Vibe Testing: What Happens When No One Reads Source Code | DeepTest | Antonio Mastropaolo (William and Mary, USA)
10:00 | 15m | Talk | Beyond Accuracy: Characterizing Code Comprehension Capabilities in (Large) Language Models | DeepTest | Felix Mächtle (University of Luebeck), Jan-Niclas Serr (University of Luebeck), Nils Loose (University of Luebeck), Thomas Eisenbarth (University of Lübeck)
10:15 | 15m | Talk | Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study (Pre-print) | DeepTest | Kohei Dozono (Technical University of Munich), Tiago Espinha Gasiba (Siemens AG), Andrea Stocco (Technical University of Munich, fortiss)
10:30 - 11:00: Monday Morning Break. Catering at Catering and Exhibition Hall (Europa I to IV). This break will provide an opportunity for networking and relaxation between sessions.
10:30 | 30m | Coffee break | Break | Catering
11:00 - 12:30
11:00 | 15m | Talk | Latent Regularization in Generative Test Input Generation (Pre-print) | DeepTest | Giorgi Merabishvili (North Carolina State University), Oliver Weissl (Technical University of Munich & fortiss), Andrea Stocco (Technical University of Munich, fortiss)
11:15 | 15m | Talk | Tool Competition Opening | DeepTest
11:30 | 15m | Talk | Warnless @ DeepTest 2026 Tool Competition | DeepTest | Qunying Song (University College London), Yuan Gao, Roberto Brusnicki, Federica Sarro (University College London)
11:45 | 15m | Talk | Exida Test Generator @ DeepTest 2026 Tool Competition | DeepTest
12:00 | 15m | Talk | Contextual Risk-Driven Input Structuring for Probing (CRISP) @ DeepTest 2026 Tool Competition | DeepTest
12:15 | 15m | Talk | ATLAS: Adaptive Test Learning And Selection @ DeepTest 2026 Tool Competition | DeepTest | Antonio Pedro Santos Alves (Pontifical Catholic University of Rio de Janeiro), Marcos Kalinowski (Pontifical Catholic University of Rio de Janeiro (PUC-Rio))
12:30 - 14:00: Monday Lunch. Catering at Catering and Exhibition Hall (Europa I to IV). Lunch time with a variety of meal options available for attendees, including vegetarian choices. This session will provide an opportunity for attendees to enjoy a meal while networking with colleagues and discussing the day’s events.
12:30 | 90m | Lunch | Lunch | Catering
14:00 - 15:30
14:00 | 15m | Day opening | Opening | FTW | Joanna Kisaakye (University of Antwerp), Wing Lam (George Mason University), Fabian Leinen (Technical University of Munich), August Shi (The University of Texas at Austin)
14:15 | 45m | Keynote | Keynote: The Ghost in the Machine - Why we still have not solved Flaky tests | FTW | Sigrid Eldh (Ericsson AB, Mälardalen University, Carleton University)
15:00 | 30m | Panel | Panel: Flaky Tests in Industry | FTW
15:30 - 16:00: Monday Afternoon Break. Catering at Catering and Exhibition Hall (Europa I to IV). Afternoon Break with a variety of beverages and snacks available for attendees. This break will provide an opportunity for networking and relaxation between sessions.
15:30 | 30m | Coffee break | Break | Catering
16:00 - 17:30
16:00 | 20m | Talk | A Preliminary Study on the Vocabulary of Flaky Tests in Swift | FTW
16:20 | 20m | Talk | Flaky Tests in a Large Industrial Database Management System: An Empirical Study of Fixed Issue Reports for SAP HANA (Pre-print) | FTW
16:40 | 20m | Talk | Preliminary Results on Evaluating Large Language Models for Labeling Root Cause Categories of Fixed Flaky Tests | FTW | Yang Chen (University of Illinois at Urbana-Champaign), Kaiyao Ke (University of California Berkeley), Darko Marinov (University of Illinois at Urbana-Champaign)
17:00 | 30m | Panel | Panel: Future of Flaky Test Research in the Era of Generative AI | FTW
20:00 - 23:00: Social Event for Co-located Conferences. Social, Networking and Special Rooms at Rio Scenarium. Co-located event participants are invited to join us at Rio Scenarium for an informal evening with live Brazilian music, food, drinks, and great company in the heart of Lapa, a traditional samba region in Rio. Buses depart from the conference venue starting at 18:00.
20:00 | 3h | Dinner | Social Event for Co-located Conferences | Social, Networking and Special Rooms
Accepted Papers
Call for Papers
The primary objective of FTW is to foster collaboration and exchange between academia and industry. FTW welcomes submissions on topics relating to flaky tests and non-determinism in testing generally. The workshop will provide an opportunity for academic researchers and industrial practitioners to exchange ideas about test flakiness and to find out about current research directions and industrial challenges. The workshop is inclusive of quantitative, qualitative, and mixed-methods research. Topics of interest include (but are not limited to):
- Causes of flaky tests.
- Costs and consequences of flaky tests.
- Debugging of flaky tests.
- Detection of flaky tests.
- Mitigation of flaky tests.
- Non-determinism in testing generally.
- Repair of flaky tests.
We expect a significant portion of the day to be spent on presentations and discussions of extended abstracts, but there will also be more formal short paper presentations. Please note that due to ICSE restrictions, submissions cannot exceed 8 pages. Submissions can take one of two formats:
- Extended abstract (max. 2 pages including references): New ideas, problems and challenges, viewpoints, work in progress. Extended abstracts are free of APC (article processing charge).
- Short paper (max. 6 pages + 2 pages references): Technical research, experience reports, empirical studies.
Submission
All submissions must be made via the following link: https://icse2026-ftw.hotcrp.com/
Each submission will be reviewed by the program committee with respect to suitability for the workshop, following a double-blind process for short papers and a single-blind process for extended abstracts. This means that the identity of short paper authors must not be revealed in their submissions. Please note that for this year’s submission, you are required to use the official “ACM Primary Article Template”. You can get this from the ACM Proceedings Template page. For those using LaTeX, make sure to use the sigconf option as well as the review option for line numbers to facilitate easy referencing by reviewers. As such, you can include the following LaTeX code at the beginning of your LaTeX document: \documentclass[sigconf,review]{acmart}.
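As a rough illustration of the template setup described above, a minimal document skeleton might look like the following. This is a sketch only: the title, author, affiliation, and body text are placeholders, and the `anonymous` option is an assumption on our part (it is a standard acmart option that hides author details, which may help short paper authors meet the double-blind requirement, but it is not mandated by the text above).

```latex
% Sketch of a submission skeleton; all names and text are placeholders.
% sigconf:   ACM conference proceedings layout
% review:    prints line numbers for reviewers
% anonymous: hides author details (assumption; relevant to double-blind
%            short papers, can be dropped for extended abstracts)
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Placeholder Title}

\author{First Author}
\affiliation{%
  \institution{Placeholder University}
  \country{Placeholder Country}}

\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder text.

\end{document}
```

With `anonymous` set, acmart replaces the author block in the output, so the author metadata can stay in the source and be revealed later by removing the option.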
More precise submission policies and formatting guidance can be found within the ICSE 2026 Research Track submission process.