XTestGen: Natural Language to Maintainable E2E Test Scripts with LLMs (ICSME 2025 - Tool Demonstration Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Hiroyuki Kirinuki, Masaki Tajima, Kei Wakabayashi

Track

ICSME 2025 Tool Demonstration Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

When

Wed 10 Sep 2025 14:00 - 14:10 at Case Room 2 260-057 - Session 4 - Testing 1 Chair(s): Sigrid Eldh

Abstract

Recent advances in web agent technologies have enabled automated web browser operations through natural language instructions. While these technologies show promise for end-to-end test automation, significant challenges remain, such as uncertainty in test execution due to LLMs, increased execution time and cost, and decreased accuracy in identifying operation targets as web pages become more complex. To address these challenges, we propose XTestGen, which generates Gherkin-format test cases and JavaScript step definitions from natural language. XTestGen improves reproducibility by producing deterministic scripts, enhances maintainability through modular step reuse and scenario abstraction, and increases element identification accuracy in complex web pages using hierarchical tree exploration. Our evaluation shows that XTestGen enables abstraction and reuse in test generation and achieves higher accuracy in element identification than naive approaches. A demonstration video is available at: https://youtu.be/sQmsNCPGtPo
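The pairing of Gherkin scenarios with JavaScript step definitions mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the step wording, the `defineStep` and `runScenario` helpers, and the `world.log` object are hypothetical, and this is neither XTestGen's output nor the API of a real BDD library such as cucumber-js. It only shows why parameterized steps are reusable across scenarios and why replaying them is deterministic.

```javascript
// Hypothetical illustration only: not XTestGen's output and not the cucumber-js API.
// A tiny registry that maps Gherkin step text to reusable JavaScript step definitions.

const stepDefinitions = [];

// Register a step: a regular expression with capture groups plus a handler.
function defineStep(pattern, handler) {
  stepDefinitions.push({ pattern, handler });
}

// Run one scenario: keep the step lines, strip the Gherkin keyword,
// find the first matching definition, and call it with the captured values.
function runScenario(scenarioText, world) {
  const keyword = /^(Given|When|Then|And|But)\s+/;
  const lines = scenarioText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => keyword.test(line));

  for (const line of lines) {
    const stepText = line.replace(keyword, "");
    const match = stepDefinitions
      .map((def) => ({ def, groups: stepText.match(def.pattern) }))
      .find((candidate) => candidate.groups !== null);
    if (!match) {
      throw new Error(`No step definition for: ${stepText}`);
    }
    // Captured groups become handler arguments, so one definition serves many steps.
    match.def.handler(world, ...match.groups.slice(1));
  }
  return world;
}

// Reusable steps. A real step would drive a browser; here each one only
// records what it would do, so the sketch runs without any dependency.
defineStep(/^the user opens "([^"]+)"$/, (world, url) => {
  world.log.push(`open ${url}`);
});
defineStep(/^the user types "([^"]+)" into "([^"]+)"$/, (world, value, field) => {
  world.log.push(`type ${field}=${value}`);
});
defineStep(/^the page shows "([^"]+)"$/, (world, text) => {
  world.log.push(`assert ${text}`);
});

const scenario = `
  Scenario: Log in
    Given the user opens "https://example.test/login"
    When the user types "alice" into "username"
    And the user types "secret" into "password"
    Then the page shows "Welcome"
`;

const result = runScenario(scenario, { log: [] });
console.log(result.log.join("\n"));
```

In this sketch the single "types ... into ..." definition handles both the username and the password step, which is the kind of step reuse the abstract refers to. Because the scenario is replayed by plain code rather than by an LLM at run time, repeated runs produce the same sequence of actions.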

File attachments

Pre-print (paper.pdf)	240KiB

Hiroyuki Kirinuki

NTT

Japan

Masaki Tajima

NTT Software Innovation Center

Japan

Kei Wakabayashi

NTT

Japan

Session Program

Wed 10 Sep
Displayed time zone: Auckland, Wellington

13:30 - 15:00	Session 4 - Testing 1 (Research Papers Track / Registered Reports / Journal First Track / NIER Track / Industry Track / Tool Demonstration Track) at Case Room 2 260-057. Chair(s): Sigrid Eldh, Ericsson AB, Mälardalen University, Carleton University

13:30 15m		Performance Testing in Open-Source Web Projects: Adoption, Maintenance, and a Change Taxonomy (Research Papers Track). Sergio Di Meglio, Università degli Studi di Napoli Federico II; Luigi Libero Lucio Starace, Università degli Studi di Napoli Federico II; Valeria Pontillo, Gran Sasso Science Institute; Ruben Opdebeeck, Vrije Universiteit Brussel; Coen De Roover, Vrije Universiteit Brussel; Sergio Di Martino, Università degli Studi di Napoli Federico II
13:45 15m		Harnessing LLMs for Document-Guided Fuzzing of OpenCV Library (Research Papers Track). Bin Duan, The University of Queensland; Tarek Mahmud, Texas State University; Meiru Che, Central Queensland University; Yan Yan, University of Illinois Chicago; Naipeng Dong, The University of Queensland, Australia; Dan Dongseong Kim, The University of Queensland; Guowei Yang, University of Queensland
14:00 10m		XTestGen: Natural Language to Maintainable E2E Test Scripts with LLMs (Tool Demonstration Track). Hiroyuki Kirinuki, NTT; Masaki Tajima, NTT Software Innovation Center; Kei Wakabayashi, NTT
14:10 10m		Towards Effective Lightweight Test Oracles for Automated Multi-Fault Program Repair (NIER Track). Omar I. Al-Bataineh, Gran Sasso Science Institute (GSSI)
14:20 15m		Testing Is Not Boring: Characterizing Challenge in Software Testing Tasks (Industry Track). Davi Gama Hardman, CESAR - Recife Center for Advanced Studies and Systems; César França, Federal Rural University of Pernambuco (UFRPE); Brody Stuart-Verner, University of Calgary; Ronnie de Souza Santos, University of Calgary
14:35 15m		Enriching automatic test case generation by extracting relevant test inputs from bug reports (Journal First Track). Wendkuuni Arzouma Marc Christian OUEDRAOGO, University of Luxembourg; Laura Plein, CISPA Helmholtz Center for Information Security; Abdoul Kader Kaboré, University of Luxembourg; Andrew Habib, ABB Corporate Research, Germany; Jacques Klein, University of Luxembourg; David Lo, Singapore Management University; Tegawendé F. Bissyandé, University of Luxembourg
14:50 10m		An Empirical Study of Complexity, Heterogeneity, and Compliance of GitHub Actions Workflows (Registered Reports). Edward Abrokwah, Department of Computer Science, Trent University, Peterborough, Canada; Taher A. Ghaleb, Trent University