A Vision for Debiasing Confirmation Bias in Software Testing via LLM
This program is tentative and subject to change.
Large language models (LLMs) suffer from various forms of bias due to the biased data sets used to train them. At the same time, human cognitive biases are equally prone to express themselves when LLMs are used for software engineering tasks. Software testing is a critical phase of the software development life cycle. Confirmation bias is reported to degrade software testing by leading to the design of more specification-consistent test cases than specification-inconsistent ones. However, debiasing (mitigation) strategies for this problem are lacking. In this paper, we first present preliminary experimental evidence that an LLM, Chat-GPT4.0, manifests confirmation bias when designing functional test cases. We then present a vision for debiasing confirmation bias in functional software testing by leveraging LLMs via a multi-agent approach. The proposed vision may improve the performance of LLMs in terms of reduced confirmation bias and serve as a debiasing technique for functional software testing.
Fri 3 Oct (displayed time zone: Hawaii)
14:00 - 15:20 | Software Testing | ESEM - Emerging Results and Vision Track / ESEM - Journal First Track / ESEM - Technical Track at Kaiulani II | Chair(s): Márcio Ribeiro, Federal University of Alagoas, Brazil
14:00 | 16m | Talk | An Empirical Investigation into Maintenance of Load Testing Scripts | ESEM - Emerging Results and Vision Track | Ibuki Nakamura, Nara Institute of Science and Technology; Kosei Horikawa, Nara Institute of Science and Technology; Brittany Reid, Nara Institute of Science and Technology; Yutaro Kashiwa, Nara Institute of Science and Technology; Hajimu Iida, Nara Institute of Science and Technology
14:16 | 16m | Talk | A Vision for Debiasing Confirmation Bias in Software Testing via LLM | ESEM - Emerging Results and Vision Track | Iflaah Salman, Lappeenranta-Lahti University of Technology (LUT); Muhammad Waseem, Faculty of Information Technology and Communication Sciences, Tampere University, 33014 Tampere, Finland; Vladimir Mandić, Faculty of Technical Sciences, University of Novi Sad; Rasanjana Dhanushkha De Alwis, Lappeenranta-Lahti University of Technology LUT
14:32 | 16m | Talk | Comparing effectiveness and efficiency of interactive application security testing (IAST) and runtime application self-protection (RASP) tools in a large java-based system | ESEM - Journal First Track | Aishwwarya Seth, Microsoft; Saikath Bhattacharya, Illinois State University; Sarah Elder, UNC-Wilmington; Nusrat Zahan, North Carolina State University; Laurie Williams, North Carolina State University
14:48 | 16m | Talk | Is Diversity a Meaningful Metric in Fairness Testing? | ESEM - Technical Track
15:04 | 16m | Talk | Where Tests Fall Short: Empirically Analyzing Oracle Gaps in Covered Code | ESEM - Technical Track | Megan Maton, University of Sheffield; Gregory Kapfhammer, Allegheny College; Phil McMinn, University of Sheffield