ESEIW 2025
Sun 28 September - Fri 3 October 2025

This program is tentative and subject to change.

Fri 3 Oct 2025 14:16 - 14:32 at Kaiulani II - Software Testing. Chair(s): Márcio Ribeiro

Large language models (LLMs) suffer from various forms of bias due to the biased data sets used to train them. At the same time, human cognitive biases are equally prone to manifest when LLMs are used for software engineering tasks. Software testing is a critical phase of the software development life cycle. Confirmation bias is reported to deteriorate software testing by leading testers to design more specification-consistent test cases than specification-inconsistent ones. However, debiasing (mitigation) strategies in this regard are lacking. In this paper, we first present preliminary experimental evidence of the manifestation of confirmation bias by an LLM, ChatGPT-4.0, when designing functional test cases. We then present a vision for debiasing confirmation bias in functional software testing by leveraging LLMs via a multi-agent approach. The proposed vision may improve the performance of LLMs in terms of reduced confirmation bias and serve as a debiasing technique for functional software testing.
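As a rough illustration of the multi-agent vision described in the abstract, the sketch below wires together three hypothetical agent roles: a drafting agent that designs functional test cases, a challenger agent prompted to add specification-inconsistent (negative) cases, and a reviewer agent that merges the suites and checks the positive/negative balance. The agent roles, prompts, and the generic LLM callable are assumptions made for illustration only, not the authors' implementation; a stub stands in for a real chat-completion client (e.g., GPT-4).

from typing import Callable

# Any function that maps a prompt string to a model response; swap in a real client.
LLM = Callable[[str], str]

def design_test_cases(requirement: str, llm: LLM) -> str:
    """Agent 1 (hypothetical): draft functional test cases for the requirement."""
    return llm(f"Design functional test cases for this requirement:\n{requirement}")

def challenge_with_negative_cases(requirement: str, draft: str, llm: LLM) -> str:
    """Agent 2 (hypothetical): counteract confirmation bias by adding
    specification-inconsistent (negative) test cases the first draft omits."""
    return llm(
        "The following test cases may over-represent specification-consistent "
        "scenarios. Add specification-inconsistent (negative) test cases that try "
        f"to falsify the requirement.\nRequirement:\n{requirement}\nDraft:\n{draft}"
    )

def review_and_merge(draft: str, negatives: str, llm: LLM) -> str:
    """Agent 3 (hypothetical): merge both suites and report the positive/negative balance."""
    return llm(
        "Merge these test suites, remove duplicates, and report the ratio of "
        f"positive to negative cases.\nPositive-leaning:\n{draft}\nNegative:\n{negatives}"
    )

def debiased_test_suite(requirement: str, llm: LLM) -> str:
    draft = design_test_cases(requirement, llm)
    negatives = challenge_with_negative_cases(requirement, draft, llm)
    return review_and_merge(draft, negatives, llm)

if __name__ == "__main__":
    # Stub LLM so the sketch runs without an API key; replace with a real model call.
    echo: LLM = lambda prompt: f"[model response to: {prompt[:60]}...]"
    print(debiased_test_suite("The system shall reject passwords shorter than 8 characters.", echo))

The split into a drafting agent and a challenging agent mirrors the debiasing intent: the second agent is explicitly instructed to seek specification-inconsistent cases that a single, confirmation-prone pass tends to leave out.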

Fri 3 Oct

Displayed time zone: Hawaii

14:00 - 15:20
14:00
16m
Talk
An Empirical Investigation into Maintenance of Load Testing Scripts
ESEM - Emerging Results and Vision Track
Ibuki Nakamura Nara Institute of Science and Technology, Kosei Horikawa Nara Institute of Science and Technology, Brittany Reid Nara Institute of Science and Technology, Yutaro Kashiwa Nara Institute of Science and Technology, Hajimu Iida Nara Institute of Science and Technology
14:16
16m
Talk
A Vision for Debiasing Confirmation Bias in Software Testing via LLM
ESEM - Emerging Results and Vision Track
Iflaah Salman Lappeenranta-Lahti University of Technology (LUT), Muhammad Waseem Faculty of Information Technology and Communication Sciences, Tampere University, 33014 Tampere, Finland, Vladimir Mandić Faculty of Technical Sciences, University of Novi Sad, Rasanjana Dhanushkha De Alwis Lappeenranta-Lahti University of Technology (LUT)
14:32
16m
Talk
Comparing effectiveness and efficiency of interactive application security testing (IAST) and runtime application self-protection (RASP) tools in a large Java-based system
ESEM - Journal First Track
Aishwwarya Seth Microsoft, Saikath Bhattacharya Illinois State University, Sarah Elder UNC-Wilmington, Nusrat Zahan North Carolina State University, Laurie Williams North Carolina State University
14:48
16m
Talk
Is Diversity a Meaningful Metric in Fairness Testing?
ESEM - Technical Track
Kazuki Funamoto Keio University, Takashi Kitamura AIST, Shingo Takada Keio University, Japan
15:04
16m
Talk
Where Tests Fall Short: Empirically Analyzing Oracle Gaps in Covered Code
ESEM - Technical Track
Megan Maton University of Sheffield, Gregory Kapfhammer Allegheny College, Phil McMinn University of Sheffield