Enriching automatic test case generation by extracting relevant test inputs from bug reports (ICSME 2025 - Journal First Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Wendkuuni Arzouma Marc Christian OUEDRAOGO, Laura Plein, Abdoul Kader Kaboré, Andrew Habib, Jacques Klein, David Lo, Tegawendé F. Bissyandé

Track

ICSME 2025 Journal First Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 10 Sep 2025 14:35 - 14:50 at Case Room 2 260-057 - Session 4 - Testing 1 Chair(s): Sigrid Eldh

Abstract

The quality of software is closely tied to the effectiveness of the tests it undergoes. Manual test writing, though crucial for bug detection, is time-consuming, which has driven significant research into automated test case generation. However, current methods often struggle to generate relevant inputs, limiting the effectiveness of the tests produced. To address this, we introduce BRMiner, a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports, thereby enhancing automated test generation tools. In this study, we evaluate BRMiner using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that BRMiner achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%, significantly outperforming methods that rely on LLMs alone. Integrating BRMiner’s inputs enhances EvoSuite’s ability to generate more effective tests, leading to increased code coverage, with gains observed in branch, instruction, method, and line coverage across multiple projects. Furthermore, BRMiner facilitated the detection of 58 unique bugs, including some that were missed by traditional baseline approaches. Overall, BRMiner’s combination of LLM filtering with traditional input extraction techniques significantly improves the relevance and effectiveness of automated test generation, advancing the detection of bugs and enhancing code coverage, thereby contributing to higher-quality software development.
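
To make the idea of extracting inputs from bug reports concrete, the following is a minimal sketch of what a purely pattern-based extraction step might look like. It is not BRMiner's implementation: the abstract does not describe the extraction rules, so the regular expressions, the function name `extract_candidate_inputs`, and the sample bug report text are all hypothetical, and the LLM filtering step is omitted entirely.

```python
import re

# Hypothetical patterns (assumptions, not taken from the paper):
# double-quoted string literals, with backslash escapes allowed inside.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Integers and decimals that are not part of an identifier or a dotted version string.
NUMBER_LITERAL = re.compile(r'(?<![\w.])-?\d+(?:\.\d+)?(?!\w|\.\d)')


def extract_candidate_inputs(report_text):
    """Return de-duplicated literals found in a bug report: strings first, then numbers."""
    strings = STRING_LITERAL.findall(report_text)
    # Blank out quoted spans so digits inside string literals are not counted again.
    remainder = STRING_LITERAL.sub(" ", report_text)
    numbers = NUMBER_LITERAL.findall(remainder)

    ordered = []
    for literal in strings + numbers:
        if literal not in ordered:
            ordered.append(literal)
    return ordered


# Hypothetical bug report text, written for this example only.
report = 'createNumber("0Xfade") fails; expected 64222. Also "0Xfade" with scale -1.5 fails.'
candidates = extract_candidate_inputs(report)
print(candidates)
```

In a pipeline of the kind the abstract describes, candidates like these would still need to be filtered for relevance before being handed to a test generator such as EvoSuite; how that filtering and hand-off are done is not specified here.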

Wendkuuni Arzouma Marc Christian OUEDRAOGO

University of Luxembourg

Luxembourg

Laura Plein

CISPA Helmholtz Center for Information Security

Abdoul Kader Kaboré

University of Luxembourg

Luxembourg

Andrew Habib

ABB Corporate Research, Germany

Germany

Jacques Klein

University of Luxembourg

Luxembourg

David Lo

Singapore Management University

Singapore

Tegawendé F. Bissyandé

University of Luxembourg

Luxembourg

Session Program

Wed 10 Sep
Displayed time zone: Auckland, Wellington

13:30 - 15:00	Session 4 - Testing 1	Research Papers Track / Registered Reports / Journal First Track / NIER Track / Industry Track / Tool Demonstration Track	at Case Room 2 260-057	Chair(s): Sigrid Eldh (Ericsson AB, Mälardalen University, Carleton University)

13:30 15m	Performance Testing in Open-Source Web Projects: Adoption, Maintenance, and a Change Taxonomy	Research Papers Track	Sergio Di Meglio Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II, Valeria Pontillo Gran Sasso Science Institute, Ruben Opdebeeck Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel, Sergio Di Martino Università degli Studi di Napoli Federico II
13:45 15m	Harnessing LLMs for Document-Guided Fuzzing of OpenCV Library	Research Papers Track	Bin Duan The University of Queensland, Tarek Mahmud Texas State University, Meiru Che Central Queensland University, Yan Yan University of Illinois Chicago, Naipeng Dong The University of Queensland, Australia, Dan Dongseong Kim The University of Queensland, Guowei Yang University of Queensland
14:00 10m	XTestGen: Natural Language to Maintainable E2E Test Scripts with LLMs	Tool Demonstration Track	Hiroyuki Kirinuki NTT, Masaki Tajima NTT Software Innovation Center, Kei Wakabayashi NTT
14:10 10m	Towards Effective Lightweight Test Oracles for Automated Multi-Fault Program Repair	NIER Track	Omar I. Al-Bataineh Gran Sasso Science Institute (GSSI)
14:20 15m	Testing Is Not Boring: Characterizing Challenge in Software Testing Tasks	Industry Track	Davi Gama Hardman CESAR - Recife Center for Advanced Studies and Systems, César França Federal Rural University of Pernambuco (UFRPE), Brody Stuart-Verner University of Calgary, Ronnie de Souza Santos University of Calgary
14:35 15m	Enriching automatic test case generation by extracting relevant test inputs from bug reports	Journal First Track	Wendkuuni Arzouma Marc Christian OUEDRAOGO University of Luxembourg, Laura Plein CISPA Helmholtz Center for Information Security, Abdoul Kader Kaboré University of Luxembourg, Andrew Habib ABB Corporate Research, Germany, Jacques Klein University of Luxembourg, David Lo Singapore Management University, Tegawendé F. Bissyandé University of Luxembourg
14:50 10m	An Empirical Study of Complexity, Heterogeneity, and Compliance of GitHub Actions Workflows	Registered Reports	Edward Abrokwah Department of Computer Science, Trent University, Peterborough, Canada, Taher A. Ghaleb Trent University