Automated test case generation enhances the efficiency and quality of software testing. Learning-based test case generation methods require an understanding of the relationships between test cases and their focal methods: accurate traceability links between the two provide a clear relational model, enabling more effective training of test case generation models. However, existing techniques frequently fail to provide precise traceability links. We conduct an empirical study on the Methods2Test dataset, which includes a wide range of test cases and their corresponding focal methods, to identify the limitations of current techniques. We divide the causes of inaccurate traceability links into two categories: incorrect extraction and missed extraction. The incorrect extraction category includes method overloading errors, name matching failures, and constructor ignorance; the missed extraction category involves constructor ignorance, similar but non-identical names, and difficulties in locating test classes that involve subclasses. Based on the insights from this study, we propose COACH (COmbine trACe Heuristics), an automated approach for establishing test-to-code traceability links. COACH first establishes file-level and class-level links as a foundation, then integrates multiple heuristics and defines their scopes to build method-level links, improving both applicability and precision. We evaluate COACH against the M2T method (used in Methods2Test) and the NC and LCBA baselines. Experimental results show that COACH outperforms these methods in precision, coverage, and efficiency. In addition, we apply COACH to 19,518 real-world Java projects to create TRACETS4J (TRACE TeSt for Java), a novel large-scale dataset. Models fine-tuned on TRACETS4J outperform those trained on Methods2Test, as measured by BLEU-4 and CodeBLEU, demonstrating the quality of TRACETS4J and the effectiveness of COACH in improving test case generation.