Multi-dimensional Assessment of CrowdSourced Testing Reports via LLMs (ASE 2025 - Research Papers)

Sun 16 - Thu 20 November 2025 Seoul, South Korea

Who

Yue Wang, Yuan Zhao, Shengcheng Yu, Zhenyu Chen

Track

ASE 2025 Research Papers

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

When

Mon 17 Nov 2025 15:20 - 15:30 at Grand Hall 4 - Human & Social Aspects 1

Abstract

Crowdsourced testing can markedly enhance test coverage and the discovery rate of potential defects compared to traditional software testing, making it increasingly popular. However, with the widespread use of crowdsourced testing, more and more crowdworkers from various backgrounds are submitting a large number of testing reports to crowdsourced testing platforms, which hinders developers from effectively reviewing the reports. Facing a vast amount of reports with varying quality, manual review is not only time-consuming and labor-intensive but also increases costs. Therefore, how to efficiently review crowdsourced testing reports has become a major challenge. To address this challenge, we propose a multi-dimensional assessment method for crowdsourced testing reports based on large language models. This method not only inherits the textuality dimension widely used in traditional report assessment but also innovatively introduces two new dimensions: adequacy and competitiveness. It comprehensively assesses the quality of crowdsourced testing reports from multiple perspectives, aiming to better screen for high-quality crowdsourced testing reports. Through experimental analysis conducted on three different applications, we have proven the consistency of our method with human raters across various dimensions, and we have also observed an enhancement in the efficiency of report assessment.

Yue Wang

Nanjing University

China

Yuan Zhao

Laboratory of Data Intelligence and Interdisciplinary Innovation, Nanjing University

Shengcheng Yu

Technical University of Munich

Germany

Zhenyu Chen

Nanjing University

China

Session Program

Mon 17 Nov
Displayed time zone: Seoul

	14:00 - 15:30	Human & Social Aspects 1	Research Papers / Journal-First Track at Grand Hall 4

	14:00 10m Talk		Why AI Agents Still Need You: Findings from Developer-Agent Collaborations in the Wild	Research Papers	Aayush Kumar (Microsoft); Yasharth Bajpai (Microsoft); Sumit Gulwani (Microsoft); Gustavo Soares (Microsoft); Emerson Murphy-Hill (Microsoft)
	14:10 10m Talk		The Cost of Downgrading Build Systems: A Case Study of Kubernetes	Research Papers	Gareema Ranjan (University of Waterloo); Mahmoud Alfadel (University of Calgary); Gengyi Sun (University of Waterloo); Shane McIntosh (University of Waterloo)
	14:20 10m Talk		Democratizing the Cryptocurrency Ecosystem by Just-In-Time Transformation of Mining Programs	Research Papers	Wei Liu (Nanjing University); Zhenhua Li (Tsinghua University); Feng Qian (University of Southern California); Feiyu Jin (Tsinghua University); Hao Lin (Tsinghua University); Yannan Zheng (Ant Group); Bo Xiao (Ant Group); Xiaokang Qin (Ant Group); Tianyin Xu (University of Illinois at Urbana-Champaign)
	14:30 10m Talk		Advancing Automated Ethical Profiling in SE: a Zero-Shot Evaluation of LLM Reasoning	Research Papers	Patrizio Migliarini (University of L'Aquila, Italy); Mashal Afzal Memon (University of L'Aquila, Italy); Marco Autili (University of L'Aquila, Italy); Paola Inverardi (Gran Sasso Science Institute)
	14:40 10m Talk		The Impact of the COVID-19 Pandemic on Women’s Contribution to Public Code	Journal-First Track	Annalí Casanueva (Ifo Institute, Big Data Junior Research Group, Munich, Germany); Davide Rossi (University of Bologna); Théo Zimmermann (Télécom Paris, Polytechnic Institute of Paris); Stefano Zacchiroli (LTCI, Télécom Paris, Institut Polytechnique de Paris, Palaiseau, France)
	14:50 10m Talk		Understanding Feature Request Practice on GitHub via a Large-Scale Empirical Study	Research Papers	Jiajun Li (Nanjing University of Aeronautics and Astronautics); Wenhua Yang (Nanjing University of Aeronautics and Astronautics); Minxue Pan (Nanjing University); Yu Zhou (Nanjing University of Aeronautics and Astronautics)
	15:00 10m Talk		Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping	Research Papers	Jingyu Xiao (The Chinese University of Hong Kong); Yuxuan Wan (The Chinese University of Hong Kong); Yintong Huo (Singapore Management University, Singapore); Zixin Wang (The Chinese University of Hong Kong); Xinyi Xu (The Chinese University of Hong Kong); Wenxuan Wang (Hong Kong University of Science and Technology); Zhiyao Xu (Tsinghua University); Yuhang Wang (Southwest University); Michael Lyu (The Chinese University of Hong Kong)
	15:10 10m Talk		Engineering Digital Systems for Humanity: a Research Roadmap	Journal-First Track	Marco Autili (University of L'Aquila, Italy); Martina De Sanctis (Gran Sasso Science Institute); Paola Inverardi (Gran Sasso Science Institute); Patrizio Pelliccione (Gran Sasso Science Institute, L'Aquila, Italy)
	15:20 10m Talk		Multi-dimensional Assessment of CrowdSourced Testing Reports via LLMs	Research Papers	Yue Wang (Nanjing University); Yuan Zhao (Laboratory of Data Intelligence and Interdisciplinary Innovation, Nanjing University); Shengcheng Yu (Technical University of Munich); Zhenyu Chen (Nanjing University)