ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Sun 14 Apr 2024 15:00 - 15:30 at Amália Rodrigues - Debugging Flaky Tests in Different Domains Chair(s): Owain Parry

Regression testing helps developers check whether the latest code changes break software functionality. Flaky tests, which can non-deterministically pass or fail on the same code version, may mislead developers, causing them to miss some bugs or spend time pinpointing bugs that do not exist. Existing flakiness detection and mitigation techniques have primarily focused on general order-dependent (OD) and implementation-dependent (ID) flaky tests. There is also a dearth of research on repairing test flakiness: most of the existing work has focused on repairing OD flaky tests, and a few studies have explored repairing a subcategory of non-order-dependent (NOD) flaky tests caused by asynchronous waits. As a result, there is a demand for techniques to reproduce, detect, and repair NOD flaky tests. Large language models (LLMs) have shown great effectiveness in several programming tasks. To explore the potential of LLMs in addressing NOD flakiness, this paper investigates the possibility of using ChatGPT to repair different categories of NOD flaky tests. Our comprehensive study on 118 flaky tests from the IDoFT dataset shows that ChatGPT, despite being a leading LLM with notable success in multiple code generation tasks, is ineffective in repairing NOD test flakiness, even when following the best practices for prompt crafting. We investigated the reasons behind ChatGPT's failure to repair NOD tests, which provided us with valuable insights into the next steps for advancing the field of NOD test flakiness repair.

Sun 14 Apr

Displayed time zone: Lisbon

14:00 - 15:30
Debugging Flaky Tests in Different Domains FTW at Amália Rodrigues
Chair(s): Owain Parry The University of Sheffield
14:00
30m
Paper
On the Impact of Hitting System Resource Limits on Test Flakiness
FTW
A: Fabian Leinen Technical University of Munich, A: Alexander Perathoner Technical University of Munich, A: Alexander Pretschner TU Munich
14:30
30m
Paper
Flaky Tests in the AI Domain
FTW
A: Péter Attila Soha Department of Software Engineering, University of Szeged, A: Béla Vancsics, A: Tamás Gergely Department of Software Engineering, University of Szeged, A: Árpád Beszédes Department of Software Engineering, University of Szeged
15:00
30m
Paper
Can ChatGPT Repair Non-Order-Dependent Tests?
FTW
A: Yang Chen University of Illinois at Urbana-Champaign, A: Reyhaneh Jabbarvand University of Illinois at Urbana-Champaign