ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Thu 18 Apr 2024 11:00 - 11:15 at Grande Auditório - Testing 3 Chair(s): José Miguel Rojas

Non-deterministic test behavior, or flakiness, is common and dreaded among developers. Researchers have studied the issue and proposed approaches to mitigate it. However, the vast majority of previous work has only considered developer-written tests. The prevalence and nature of flaky tests produced by test generation tools remains largely unknown. We ask whether such tools also produce flaky tests and how these differ from developer-written ones. Furthermore, we evaluate mechanisms that suppress flaky test generation. We sample 6 356 projects written in Java or Python. For each project, we generate tests using EvoSuite (Java) and Pynguin (Python), and execute each test 200 times, looking for inconsistent outcomes. Our results show that flakiness is at least as common in generated tests as in developer-written tests. Nevertheless, existing flakiness suppression mechanisms are effective in alleviating this issue (71.7 % fewer flaky tests). Compared to developer-written flaky tests, the causes of generated flaky tests are distributed differently. Their non-deterministic behavior is more frequently caused by randomness, rather than by networking and concurrency. Using flakiness suppression, the remaining flaky tests differ significantly from any flakiness previously reported, where most are attributable to runtime optimizations and EvoSuite-internal resource thresholds. These insights, with the accompanying dataset, can help maintainers to improve test generation tools, give recommendations for developers using these tools, and serve as a foundation for future research in test flakiness or test generation.
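The rerun-based detection mentioned in the abstract (execute each test 200 times and look for inconsistent outcomes) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the helper names `run_once` and `is_flaky` and the two toy tests are hypothetical, and a test is simply assumed to be a callable that raises on failure. Only the rerun count of 200 comes from the abstract.

```python
import random
from typing import Callable

RERUNS = 200  # rerun count stated in the abstract


def run_once(test: Callable[[], None]) -> bool:
    """Run a test once; True means it passed, False means it raised."""
    try:
        test()
        return True
    except Exception:
        return False


def is_flaky(test: Callable[[], None], reruns: int = RERUNS) -> bool:
    """A test counts as flaky here if repeated runs yield both outcomes."""
    outcomes = {run_once(test) for _ in range(reruns)}
    return len(outcomes) > 1


# Hypothetical toy tests, for illustration only.
_rng = random.Random(0)  # seeded so the example itself is reproducible


def always_passes() -> None:
    assert 1 + 1 == 2


def depends_on_randomness() -> None:
    # Fails whenever the drawn value falls below 0.5.
    assert _rng.random() >= 0.5


if __name__ == "__main__":
    print("always_passes flaky:", is_flaky(always_passes))
    print("depends_on_randomness flaky:", is_flaky(depends_on_randomness))
```

A test that fails on every run is not flagged by this check, since its outcome is consistent. The sketch also only compares pass/fail, so it would miss inconsistencies that a real study might track, such as differing failure messages.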

Thu 18 Apr

11:00 - 12:30
11:00
15m
Talk
Do Automatic Test Generation Tools Generate Flaky Tests?
Research Track
Martin Gruber BMW Group, University of Passau, Muhammad Firhard Roslan University of Sheffield, Owain Parry The University of Sheffield, Fabian Scharnböck University of Passau, Phil McMinn University of Sheffield, Gordon Fraser University of Passau
11:15
15m
Talk
Deep Combination of CDCL(T) and Local Search for Satisfiability Modulo Non-Linear Integer Arithmetic Theory
Research Track
Xindi Zhang Institute of Software Chinese Academy of Science, Bohan Li Institute of Software Chinese Academy of Science, Shaowei Cai Institute of Software at Chinese Academy of Sciences
11:30
15m
Talk
Uncover the Premeditated Attacks: Detecting Exploitable Reentrancy Vulnerabilities by Identifying Attacker Contracts
Research Track
Shuo Yang Sun Yat-sen University, Jiachi Chen Sun Yat-sen University, Mingyuan Huang Sun Yat-Sen University, Zibin Zheng Sun Yat-sen University, Yuan Huang School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
11:45
15m
Talk
Practical Non-Intrusive GUI Exploration Testing with Visual-based Robotic Arms
Research Track
Shengcheng Yu Nanjing University, Chunrong Fang Nanjing University, Mingzhe Du Nanjing University, Yuchen Ling Nanjing University, Zhenyu Chen Nanjing University, Zhendong Su ETH Zurich
12:00
15m
Talk
Dynamic Inference of Likely Symbolic Tensor Shapes in Python Machine Learning Programs
Software Engineering in Practice
Dan Zheng Google DeepMind, Koushik Sen Google DeepMind
12:15
7m
Talk
Mutation Analysis for Evaluating Code Translation
Journal-first Papers
Giovani Guizzo Brick Abode, Jie M. Zhang King's College London, Federica Sarro University College London, Mark Harman Meta Platforms, Inc. and UCL, Christoph Treude Singapore Management University
12:22
7m
Talk
Generalized Coverage Criteria for Combinatorial Sequence Testing
Journal-first Papers
Achiya Elyasaf Ben-Gurion University of the Negev, Eitan Farchi IBM Haifa Research Lab, Oded Margalit Ben-Gurion University of the Negev, Gera Weiss Ben-Gurion University of the Negev, Yeshayahu Weiss Ben-Gurion University of the Negev