ICST 2025
Mon 31 March - Fri 4 April 2025 Naples, Italy

This program is tentative and subject to change.

Thu 3 Apr 2025 12:00 - 12:15 at Room A - LLMs in Testing Chair(s): Valerio Terragni

Test Input Generators (TIGs) are crucial to assess the ability of Deep Learning (DL) image classifiers to provide correct predictions for inputs beyond their training and test sets. Recent advancements in Generative AI (GenAI) models have made them a powerful tool for creating and manipulating synthetic images, although these advancements also imply increased complexity and resource demands for training.

In this work, we benchmark and combine different GenAI models with TIGs, assessing their effectiveness, efficiency, and the quality of the generated test images in terms of domain validity and label preservation. We conduct an empirical study involving three different GenAI architectures (VAEs, GANs, Diffusion Models), five classification tasks of increasing complexity, and 364 human evaluations. Our results show that simpler architectures, such as VAEs, are sufficient for less complex datasets like MNIST. However, when dealing with feature-rich datasets, such as ImageNet, more sophisticated architectures like Diffusion Models achieve superior performance by generating a higher number of valid, misclassification-inducing inputs.

Thu 3 Apr

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:22
LLMs in Testing
Research Papers / Short Papers, Vision and Emerging Results at Room A
Chair(s): Valerio Terragni University of Auckland
11:00
15m
Talk
Improving the Readability of Automatically Generated Tests using Large Language Models
Research Papers
Matteo Biagiola Università della Svizzera italiana, Gianluca Ghislotti Università della Svizzera italiana, Paolo Tonella USI Lugano
11:15
15m
Talk
Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation
Research Papers
Azat Abdullin JetBrains Research, TU Delft, Pouria Derakhshanfar JetBrains Research, Annibale Panichella Delft University of Technology
11:30
15m
Talk
Benchmarking Open-source Large Language Models For Log Level Suggestion
Research Papers
Yi Wen HENG Concordia University, Zeyang Ma Concordia University, Zhenhao Li York University, Dong Jae Kim DePaul University, Tse-Hsun (Peter) Chen Concordia University
11:45
15m
Talk
Understanding and Enhancing Attribute Prioritization in Fixing Web UI Tests with LLMs
Research Papers
Zhuolin Xu Concordia University, Qiushi Li Concordia University, Shin Hwei Tan Concordia University
12:00
15m
Talk
Benchmarking Generative AI Models for Deep Learning Test Input Generation
Research Papers
Maryam Maryam, Matteo Biagiola Università della Svizzera italiana, Andrea Stocco Technical University of Munich, fortiss, Vincenzo Riccio University of Udine
Pre-print
12:15
7m
Talk
Leveraging Large Language Models for Explicit Wait Management in End-to-End Web Testing
Short Papers, Vision and Emerging Results
Dario Olianas DIBRIS, University of Genova, Italy, Maurizio Leotta DIBRIS, University of Genova, Italy, Filippo Ricca Università di Genova