LLMs for Automated Unit Test Generation and Assessment in Java: The AgoneTest Framework (ASE 2025 - Research Papers)

Sun 16 - Thu 20 November 2025 Seoul, South Korea

Who

Andrea Lops, Fedelucio Narducci, Azzurra Ragone, Michelantonio Trizio, Claudio Bartolini

Track

ASE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 19 Nov 2025, 14:00 - 14:10, at Vista. Session: Test Generation, Selection & Prioritization 2. Chair(s): Darko Marinov

Abstract

Unit testing is an essential but resource-intensive step in software development, ensuring individual code units function correctly. This paper introduces AgoneTest, an automated system designed to generate and evaluate unit test suites for real-world Java projects using Large Language Models (LLMs). We provide a newly developed Classes2Test dataset, which maps Java focal classes to their test counterparts, and a framework that integrates advanced evaluation metrics, such as mutation coverage and test smells, for a comprehensive assessment. Experimental results show that, for the subset of tests that compile, LLM-generated tests can match or exceed human-written tests in terms of coverage and defect detection. Enhanced prompting strategies also contribute to test quality. AgoneTest automatically evaluates the potential of LLMs in automating software testing, offering insights for future improvements in model design, prompt engineering, and testing practices.

Link to Preprint

https://arxiv.org/abs/2511.20403

Andrea Lops, Polytechnic University of Bari, Italy
Fedelucio Narducci, Polytechnic University of Bari
Azzurra Ragone, University of Bari, Italy
Michelantonio Trizio, Wideverse
Claudio Bartolini, Wideverse s.r.l., Italy

Session Program

Wed 19 Nov
Displayed time zone: Seoul

14:00 - 15:30 Test Generation, Selection & Prioritization 2 (Research Papers / Journal-First) at Vista. Chair(s): Darko Marinov (University of Illinois at Urbana-Champaign)

14:00, 10m, Talk: LLMs for Automated Unit Test Generation and Assessment in Java: The AgoneTest Framework (Research Papers; pre-print available)
    Andrea Lops (Polytechnic University of Bari, Italy), Fedelucio Narducci (Polytechnic University of Bari), Azzurra Ragone (University of Bari), Michelantonio Trizio (Wideverse), Claudio Bartolini (Wideverse s.r.l.)

14:10, 10m, Talk: µOpTime: Statically Reducing the Execution Time of Microbenchmark Suites Using Stability Metrics (Journal-First)
    Nils Japke (TU Berlin & ECDF), Martin Grambow (TU Berlin & ECDF), Christoph Laaber (Simula Research Laboratory), David Bermbach (TU Berlin)

14:20, 10m, Talk: Reference-Based Retrieval-Augmented Unit Test Generation (Journal-First)
    Zhe Zhang (Beihang University), Liu Xingyu (Beihang University), Yuanzhang Lin (Beihang University), Xiang Gao (Beihang University), Hailong Sun (Beihang University), Yuan Yuan (Beihang University)

14:30, 10m, Talk: Using Active Learning to Train Predictive Mutation Testing with Minimal Data (Research Papers)
    Miklos Borsi (Karlsruhe Institute of Technology)

14:40, 10m, Talk: Clarifying Semantics of In-Context Examples for Unit Test Generation (Research Papers)
    Chen Yang (Tianjin University), Lin Yang (Tianjin University), Ziqi Wang (Tianjin University), Dong Wang (Tianjin University), Jianyi Zhou (Huawei Cloud Computing Technologies Co., Ltd.), Junjie Chen (Tianjin University)

14:50, 10m, Talk: An empirical study of test case prioritization on the Linux Kernel (Journal-First)
    Haichi Wang (College of Intelligence and Computing, Tianjin University), Ruiguo Yu (College of Intelligence and Computing, Tianjin University), Dong Wang (Tianjin University), Yiheng Du (College of Intelligence and Computing, Tianjin University), Yingquan Zhao (Tianjin University), Junjie Chen (Tianjin University), Zan Wang (Tianjin University)

15:00, 10m, Talk: Automated Generation of Issue-Reproducing Tests by Combining LLMs and Search-Based Testing (Research Papers; pre-print available)
    Konstantinos Kitsios (University of Zurich), Marco Castelluccio (Mozilla), Alberto Bacchelli (University of Zurich)

15:10, 10m, Talk: Using Fourier Analysis and Mutant Clustering to Accelerate DNN Mutation Testing (Research Papers)
    Ali Ghanbari (Auburn University), Sasan Tavakkol (Google Research)

15:20, 10m, Talk: WEST: Specification-Based Test Generation for WebAssembly (Research Papers)
    Dongjun Youn (KAIST), Shin Wonho (KAIST), Sukyoung Ryu (KAIST)