ICSME 2025
Sun 7 - Fri 12 September 2025 Auckland, New Zealand

This program is tentative and subject to change.

Thu 11 Sep 2025 14:45 - 14:55 at Case Room 3 260-055 - Session 9 - Testing 3 Chair(s): Sigrid Eldh

Large Language Models (LLMs) show remarkable performance in generating source code, yet the generated code often suffers from issues such as compilation errors or incorrect behavior. Researchers and developers repeatedly implement their own checks to refine LLM-generated code, duplicating effort across projects. This paper presents LLMLOOP, a framework that automates the refinement of both source code and test cases produced by LLMs. LLMLOOP employs five iterative feedback loops that resolve compilation errors, address static analysis issues, fix test case failures, and improve test quality through mutation analysis. These loops ensure the generation of high-quality test cases that serve as both a validation mechanism and a regression test suite for the generated code. We evaluated LLMLOOP on HUMANEVAL-X, a recent benchmark of programming tasks. The results demonstrate the tool's effectiveness in refining LLM-generated outputs. A demonstration video of the tool is available at https://youtu.be/2CLG9x1fsNI
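To make the loop structure concrete, below is a minimal Python sketch of the iterative feedback-loop idea described in the abstract. It is an illustration under stated assumptions, not the LLMLOOP implementation: the names (refine, CheckResult, Checker, generate) and the re-prompting strategy are hypothetical, and the LLM and checkers are stubbed.

```python
# A minimal sketch of an iterative LLM refinement loop, in the spirit of the
# abstract above. All names and interfaces here are hypothetical.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class CheckResult:
    ok: bool
    feedback: str = ""  # diagnostics fed back into the next LLM prompt

# A checker takes (candidate code, tests) and reports pass/fail plus feedback.
# Real checkers would wrap a compiler, a static analyzer, the test runner,
# and a mutation-testing tool.
Checker = Callable[[str, str], CheckResult]

def refine(generate: Callable[[str], str],
           prompt: str,
           checkers: list[Checker],
           tests: str,
           max_iters: int = 5) -> Optional[str]:
    """Run each feedback loop in order; within a loop, re-prompt the LLM
    with the checker's diagnostics until the check passes or the
    iteration budget is exhausted."""
    code = generate(prompt)
    for check in checkers:
        for _ in range(max_iters):
            result = check(code, tests)
            if result.ok:
                break  # this loop converged; move to the next checker
            # Feed the diagnostics back to the LLM and regenerate.
            code = generate(f"{prompt}\n\nFix the following issues:\n{result.feedback}")
        else:
            return None  # this loop never converged within max_iters
    return code

# Hypothetical usage with a stubbed LLM and an always-passing checker.
if __name__ == "__main__":
    code = refine(generate=lambda p: "def add(a, b):\n    return a + b",
                  prompt="Write add(a, b).",
                  checkers=[lambda c, t: CheckResult(ok=True)],
                  tests="assert add(1, 2) == 3")
    print(code)
```

The key design point the sketch captures is that each quality gate gets its own loop, so diagnostics from one stage (e.g., a failing test) are resolved before the next, more expensive stage (e.g., mutation analysis) runs.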


Thu 11 Sep

Displayed time zone: Auckland, Wellington

13:30 - 15:00
Session 9 - Testing 3 (Journal First Track / NIER Track / Tool Demonstration Track / Research Papers Track / Registered Reports) at Case Room 3 260-055
Chair(s): Sigrid Eldh Ericsson AB, Mälardalen University, Carleton University
13:30
15m
Full-paper
Metamorphic Testing of Large Language Models for Natural Language Processing
Research Papers Track
Steven Cho The University of Auckland, New Zealand, Stefano Ruberto JRC European Commission, Valerio Terragni University of Auckland
Pre-print
13:45
15m
Onweer: Automated Resilience Testing through Fuzzing
Research Papers Track
Gilles Coremans Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel
Pre-print
14:00
10m
Generating Highly Structured Test Inputs Leveraging Constraint-Guided Graph Refinement
Registered Reports
Zhaorui Yang University of California, Riverside, Yuxin Qiu University of California at Riverside, Haichao Zhu Meta, Qian Zhang University of California at Riverside
14:10
10m
Prioritizing Test Smells: An Empirical Evaluation of Quality Metrics and Developer Perceptions
NIER Track
Md Arif Hasan University of Dhaka, Bangladesh, Toukir Ahammed Institute of Information Technology, University of Dhaka
14:20
10m
LLMShot: Reducing snapshot testing maintenance via LLMs
NIER Track
Ergün Batuhan Kaynak Bilkent University, Mayasah Lami Bilkent University, Sahand Moslemi Yengejeh Bilkent University, Anil Koyuncu Bilkent University
Pre-print
14:30
15m
Combinatorial Transition Testing in Dynamically Adaptive Systems: Implementation and Test Oracle
Journal First Track
Pierre Martou UCLouvain / ICTEAM, Benoît Duhoux Université catholique de Louvain, Belgium, Kim Mens Université catholique de Louvain, ICTEAM institute, Belgium, Axel Legay Université Catholique de Louvain, Belgium
14:45
10m
LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops
Tool Demonstration Track
Ravin Ravi University of Auckland, Dylan Bradshaw University of Auckland, Stefano Ruberto JRC European Commission, Gunel Jahangirova King's College London, Valerio Terragni University of Auckland
Pre-print