A Vision for Debiasing Confirmation Bias in Software Testing via LLM (ESEIW 2025 - ESEM - Emerging Results and Vision Track)

Who

Iflaah Salman, Muhammad Waseem, Vladimir Mandić, Rasanjana Dhanushkha De Alwis

Track

ESEIW 2025 ESEM - Emerging Results and Vision Track

Time Zone

The program is currently displayed in (GMT-10:00) Hawaii.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Fri 3 Oct 2025 14:16 - 14:32 at Kaiulani II - Software Testing. Chair(s): Márcio Ribeiro

Abstract

Large language models (LLMs) suffer from various forms of bias due to the biased data sets used to train them. At the same time, human cognitive biases are equally likely to manifest when LLMs are used for software engineering tasks. Software testing is a critical phase of the software development life cycle. Confirmation bias is reported to degrade software testing, because testers design more specification-consistent test cases than specification-inconsistent ones. However, debiasing (mitigation) strategies are lacking in this regard. In this paper, we first present preliminary experimental evidence for the manifestation of confirmation bias by an LLM, Chat-GPT4.0, in the design of functional test cases. We then present a vision for debiasing confirmation bias in functional software testing by leveraging LLMs via a multi-agent approach. The proposed vision may improve the performance of LLMs in terms of reduced confirmation bias and serve as a debiasing technique for functional software testing.

Iflaah Salman

Lappeenranta-Lahti University of Technology (LUT)

Finland

Muhammad Waseem

Faculty of Information Technology and Communication Sciences, Tampere University, 33014 Tampere, Finland

Finland

Vladimir Mandić

Faculty of Technical Sciences, University of Novi Sad

Serbia

Rasanjana Dhanushkha De Alwis

Lappeenranta-Lahti University of Technology (LUT)

Finland

Session Program

Fri 3 Oct
Displayed time zone: Hawaii

14:00 - 15:20	Software Testing: ESEM - Emerging Results and Vision Track / ESEM - Journal First Track / ESEM - Technical Track, at Kaiulani II. Chair(s): Márcio Ribeiro (Federal University of Alagoas, Brazil)

14:00 16m Talk		An Empirical Investigation into Maintenance of Load Testing Scripts ESEM - Emerging Results and Vision Track Ibuki Nakamura Nara Institute of Science and Technology, Kosei Horikawa Nara Institute of Science and Technology, Brittany Reid Nara Institute of Science and Technology, Yutaro Kashiwa Nara Institute of Science and Technology, Hajimu Iida Nara Institute of Science and Technology
14:16 16m Talk		A Vision for Debiasing Confirmation Bias in Software Testing via LLM ESEM - Emerging Results and Vision Track Iflaah Salman Lappeenranta-Lahti University of Technology (LUT), Muhammad Waseem Faculty of Information Technology and Communication Sciences, Tampere University, 33014 Tampere, Finland, Vladimir Mandić Faculty of Technical Sciences, University of Novi Sad, Rasanjana Dhanushkha De Alwis Lappeenranta-Lahti University of Technology (LUT)
14:32 16m Talk		Comparing effectiveness and efficiency of interactive application security testing (IAST) and runtime application self-protection (RASP) tools in a large java-based system ESEM - Journal First Track Aishwwarya Seth Microsoft, Saikath Bhattacharya Illinois State University, Sarah Elder UNC-Wilmington, Nusrat Zahan North Carolina State University, Laurie Williams North Carolina State University
14:48 16m Talk		Is Diversity a Meaningful Metric in Fairness Testing? ESEM - Technical Track Kazuki Funamoto Keio University, Takashi Kitamura AIST, Shingo Takada Keio University, Japan
15:04 16m Talk		Where Tests Fall Short: Empirically Analyzing Oracle Gaps in Covered Code ESEM - Technical Track Megan Maton University of Sheffield, Gregory Kapfhammer Allegheny College, Phil McMinn University of Sheffield