SPICE: An Automated SWE-Bench Labeling Pipeline for Issue Clarity, Test Coverage, and Effort Estimation (ASE 2025 - Research Papers)

Who

Aaditya Bhatia, Gustavo Oliva, Gopi Krishnan Rajbahadur, Haoxiang Zhang, Yihao Chen, Zhilong Chen, Arthur Leung, Dayi Lin, Boyuan Chen, Ahmed E. Hassan

Track

ASE 2025 Research Papers

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 17 Nov 2025 14:20 - 14:30 at Grand Hall 5 - Software Process

Abstract

High-quality labeled datasets are crucial for training and evaluating foundation models in software engineering, but creating them is often prohibitively expensive and labor-intensive. We introduce SPICE, a scalable, automated pipeline for labeling SWE-bench-style datasets with annotations for issue clarity, test coverage, and effort estimation. SPICE combines context-aware code navigation, rationale-driven prompting, and multi-pass consensus to produce labels that closely approximate expert annotations. SPICE’s design was informed by our own experience and frustration in labeling more than 800 tasks from SWE-Gym. SPICE achieves strong agreement with human-labeled SWE-bench Verified data while reducing the cost of labeling 1,000 instances from around $100,000 (manual annotation) to just $5.10. These results demonstrate SPICE’s potential to enable cost-effective, large-scale dataset creation for SE-focused FMs.
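
The abstract names multi-pass consensus and a per-1,000-instance cost comparison without giving details. The following is a minimal sketch of one way such a consensus step could look, assuming integer labels and a simple majority vote; it is not the authors' implementation. The names `consensus_label`, `label_with_passes`, `rate_once`, and `cost_per_instance` are hypothetical, and the tie-breaking rule is an assumption made only for illustration.

```python
from collections import Counter
from typing import Callable, Sequence


def consensus_label(labels: Sequence[int]) -> int:
    """Return the most frequent label; ties go to the lowest label value.

    Majority voting and this tie-break are assumptions for illustration,
    not details taken from the abstract.
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def label_with_passes(rate_once: Callable[[str], int], task: str, passes: int = 3) -> int:
    """Call a hypothetical single-pass rater `passes` times and vote on the result."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    return consensus_label([rate_once(task) for _ in range(passes)])


def cost_per_instance(total_cost_usd: float, instances: int) -> float:
    """Average labeling cost per instance in US dollars."""
    return total_cost_usd / instances


if __name__ == "__main__":
    # Figures from the abstract: cost of labeling 1,000 instances.
    manual = cost_per_instance(100_000, 1_000)
    automated = cost_per_instance(5.10, 1_000)
    print(f"manual: ${manual:.2f} per instance, automated: ${automated:.4f} per instance")
```

With the abstract's figures, the helper gives about $100 per instance for manual annotation and about half a cent per instance for the automated pipeline.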

Aaditya Bhatia

Queen's University

Gustavo Oliva

Centre for Software Excellence, Huawei Canada

Gopi Krishnan Rajbahadur

Centre for Software Excellence, Huawei, Canada

Haoxiang Zhang

Huawei

Canada

Yihao Chen

Center for Software Excellence, Huawei Canada

Zhilong Chen

Center for Software Excellence, Huawei Canada

Arthur Leung

Center for Software Excellence, Huawei Canada

Dayi Lin

Centre for Software Excellence, Huawei Canada

Canada

Boyuan Chen

Centre for Software Excellence, Huawei Canada

Ahmed E. Hassan

Queen’s University

Canada

Session Program

Mon 17 Nov
Displayed time zone: Seoul

14:00 - 15:30	Software Process (Research Papers / Journal-First Track) at Grand Hall 5

14:00 10m Talk	LAURA: Enhancing Code Review Generation with Context-Enriched Retrieval-Augmented LLM	Research Papers	Yuxin Zhang (Beijing Institute of Technology), Yuxia Zhang (Beijing Institute of Technology), Zeyu Sun (Institute of Software, Chinese Academy of Sciences), Yanjie Jiang (Peking University), Hui Liu (Beijing Institute of Technology)
14:10 10m Talk	AlertGuardian: Intelligent Alert Life-Cycle Management for Large-scale Cloud Systems	Research Papers	Guangba Yu (The Chinese University of Hong Kong), Genting Mai (Sun Yat-sen University), Rui Wang (Tencent), Ruipeng Li (Tencent), Pengfei Chen (Sun Yat-sen University), Long Pan (Tencent), Ruijie Xu (Tencent)
14:20 10m Talk	SPICE: An Automated SWE-Bench Labeling Pipeline for Issue Clarity, Test Coverage, and Effort Estimation	Research Papers	Aaditya Bhatia (Queen's University), Gustavo Oliva (Centre for Software Excellence, Huawei Canada), Gopi Krishnan Rajbahadur (Centre for Software Excellence, Huawei, Canada), Haoxiang Zhang (Huawei), Yihao Chen (Center for Software Excellence, Huawei Canada), Zhilong Chen (Center for Software Excellence, Huawei Canada), Arthur Leung (Center for Software Excellence, Huawei Canada), Dayi Lin (Centre for Software Excellence, Huawei Canada), Boyuan Chen (Centre for Software Excellence, Huawei Canada), Ahmed E. Hassan (Queen’s University)
14:30 10m Talk	Managing the variability of a logistics robotic system	Journal-First Track	Kentaro Yoshimura (Hitachi, Ltd.), Yuta Yamauchi (Hitachi, Ltd.), Hideo Takahashi (Hitachi, Ltd.)
14:40 10m Talk	Sprint2Vec: A Deep Characterization of Sprints in Iterative Software Development	Journal-First Track	Morakot Choetkiertikul (Mahidol University, Thailand), Peerachai Banyongrakkul (Mahidol University), Chaiyong Rakhitwetsagul (Mahidol University, Thailand), Suppawong Tuarob (Mahidol University), Hoa Khanh Dam (University of Wollongong), Thanwadee Sunetnanta (Mahidol University)
14:50 10m Talk	Supporting Emotional Intelligence, Productivity and Team Goals while Handling Software Requirements Changes	Journal-First Track	Kashumi Madampe (Monash University, Australia), Rashina Hoda (Monash University), John Grundy (Monash University)
15:00 10m Talk	Rechecking Recheck Requests in Continuous Integration: An Empirical Study of OpenStack	Research Papers	Yelizaveta Brus (University of Waterloo), Rungroj Maipradit (University of Waterloo), Earl T. Barr (University College London), Shane McIntosh (University of Waterloo)
15:10 10m Talk	An LLM-based multi-agent framework for agile effort estimation	Research Papers	Long Bui (University of Wollongong), Hoa Khanh Dam (University of Wollongong), Rashina Hoda (Monash University)
15:20 10m Talk	From Characters to Structure: Rethinking Real-Time Collaborative Programming Models	Research Papers	Leon Freudenthaler (FH Campus Wien), Bernhard Taufner (FH Campus Wien), Karl M. Göschka (TU Wien)