ICST 2025
Mon 31 March - Fri 4 April 2025 Naples, Italy

This program is tentative and subject to change.

Thu 3 Apr 2025 11:15 - 11:30 at Room A - LLMs in Testing Chair(s): Valerio Terragni

Generating tests automatically is a key and ongoing area of focus in software engineering research. The emergence of Large Language Models (LLMs) has opened up new opportunities, given their ability to perform a wide spectrum of tasks. However, the effectiveness of LLM-based approaches compared to traditional techniques such as search-based software testing (SBST) and symbolic execution remains uncertain. In this paper, we perform an extensive study of automatic test generation approaches based on three tools: EvoSuite for SBST, Kex for symbolic execution, and TestSpark for LLM-based test generation. We evaluate the tools' performance on the GitBug Java dataset and compare them using various execution-based and feature-based metrics. Our results show that while LLM-based test generation is promising, it falls behind traditional methods in terms of coverage. However, it significantly outperforms them in mutation score, suggesting that LLMs provide a deeper semantic understanding of code. The LLM-based approach also performed worse than the SBST and symbolic-execution-based approaches with respect to fault detection capabilities. Additionally, our feature-based analysis shows that all tools are primarily affected by the complexity and internal dependencies of the class under test (CUT), with LLM-based approaches being especially sensitive to the CUT size.


Thu 3 Apr

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:22
LLMs in Testing (Research Papers / Short Papers, Vision and Emerging Results) at Room A
Chair(s): Valerio Terragni University of Auckland
11:00
15m
Talk
Improving the Readability of Automatically Generated Tests using Large Language Models
Research Papers
Matteo Biagiola Università della Svizzera italiana, Gianluca Ghislotti Università della Svizzera italiana, Paolo Tonella USI Lugano
11:15
15m
Talk
Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation
Research Papers
Azat Abdullin JetBrains Research, TU Delft, Pouria Derakhshanfar JetBrains Research, Annibale Panichella Delft University of Technology
11:30
15m
Talk
Benchmarking Open-source Large Language Models For Log Level Suggestion
Research Papers
Yi Wen HENG Concordia University, Zeyang Ma Concordia University, Zhenhao Li York University, Dong Jae Kim DePaul University, Tse-Hsun (Peter) Chen Concordia University
11:45
15m
Talk
Understanding and Enhancing Attribute Prioritization in Fixing Web UI Tests with LLMs
Research Papers
Zhuolin Xu Concordia University, Qiushi Li Concordia University, Shin Hwei Tan Concordia University
12:00
15m
Talk
Benchmarking Generative AI Models for Deep Learning Test Input Generation
Research Papers
Maryam Maryam University of Udine, Matteo Biagiola Università della Svizzera italiana, Andrea Stocco Technical University of Munich, fortiss, Vincenzo Riccio University of Udine
Pre-print
12:15
7m
Talk
Leveraging Large Language Models for Explicit Wait Management in End-to-End Web Testing
Short Papers, Vision and Emerging Results
Dario Olianas DIBRIS, University of Genova, Italy, Maurizio Leotta DIBRIS, University of Genova, Italy, Filippo Ricca Università di Genova