This program is tentative and subject to change.

Thu 1 May 2025 12:15 - 12:30 at 214 - AI for Testing and QA 3 Chair(s): Mike Papadakis

Test oracles play a crucial role in software testing, enabling effective bug detection. Despite initial promise, neural methods for automated test oracle generation often result in a large number of false positives and weaker test oracles. While LLMs have shown impressive effectiveness in various software engineering tasks, including code generation, test case creation, and bug fixing, there remains a notable absence of large-scale studies exploring their effectiveness in test oracle generation. Whether LLMs can address the challenges in effective oracle generation is a compelling question that requires thorough investigation.

In this research, we present the first comprehensive study to investigate the capabilities of LLMs in generating correct, diverse, and strong test oracles capable of effectively identifying a large number of unique bugs. To this end, we fine-tuned seven code LLMs using six distinct prompts on a large dataset consisting of 110 Java projects. Utilizing the most effective fine-tuned LLM and prompt pair, we introduce TOGLL, a novel LLM-based method for test oracle generation. To investigate the generalizability of TOGLL, we conduct studies on 25 unseen large-scale Java projects. Besides assessing the correctness, we also assess the diversity and strength of the generated oracles. We compare the results against EvoSuite and the state-of-the-art neural method, TOGA. Our findings reveal that TOGLL can produce 3.8 times more correct assertion oracles and 4.9 times more exception oracles. Regarding bug detection effectiveness, TOGLL can detect 1,023 unique mutants that EvoSuite cannot, which is ten times more than what the previous SOTA neural-based method, TOGA, can detect. Additionally, TOGLL significantly outperforms TOGA in detecting real bugs from the Defects4J dataset.
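For readers unfamiliar with the terminology, the following minimal sketch illustrates what an assertion oracle, an exception oracle, and oracle "strength" (the ability to detect mutants) might look like. It is written in Python for brevity, although the study itself targets Java; the functions `divide`, `divide_mutant`, and `kills` are hypothetical illustrations and are not taken from TOGLL, TOGA, or EvoSuite.

```python
def divide(a, b):
    """Hypothetical unit under test (not from the paper)."""
    if b == 0:
        raise ValueError("division by zero")
    return a / b


def divide_mutant(a, b):
    """Hypothetical mutant: '/' replaced by '*' to simulate a seeded bug."""
    if b == 0:
        raise ValueError("division by zero")
    return a * b


def weak_assertion_oracle(fn):
    # Correct but weak: holds for almost any implementation.
    return fn(6, 3) is not None


def strong_assertion_oracle(fn):
    # Correct and strong: checks the exact expected value.
    return fn(6, 3) == 2


def exception_oracle(fn):
    # Exception oracle: the call is expected to raise ValueError.
    try:
        fn(1, 0)
    except ValueError:
        return True
    return False


def kills(oracle, original, mutant):
    # An oracle detects a mutant if it passes on the original
    # implementation and fails on the mutated one.
    return oracle(original) and not oracle(mutant)


if __name__ == "__main__":
    for oracle in (weak_assertion_oracle, strong_assertion_oracle, exception_oracle):
        print(oracle.__name__, kills(oracle, divide, divide_mutant))
```

In this toy setting only the strong assertion oracle detects the mutant: the weak oracle and the exception oracle both pass on the original and on the mutated function, which is why correctness alone does not imply bug-detection effectiveness.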

Thu 1 May

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
AI for Testing and QA 3, Research Track, at 214
Chair(s): Mike Papadakis University of Luxembourg
11:00
15m
Talk
A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs
Research Track
Myeongsoo Kim Georgia Institute of Technology, Tyler Stennett Georgia Institute of Technology, Saurabh Sinha IBM Research, Alessandro Orso Georgia Institute of Technology
11:15
15m
Talk
ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs (Artifact-Available)
Research Track
Hongyan Gao State Key Laboratory for Novel Software Technology, Nanjing University, Yibiao Yang Nanjing University, Maolin Sun Nanjing University, Jiangchang Wu State Key Laboratory for Novel Software Technology, Nanjing University, Yuming Zhou Nanjing University, Baowen Xu State Key Laboratory for Novel Software Technology, Nanjing University
11:30
15m
Talk
LLM Based Input Space Partitioning Testing for Library APIs (Artifact-Functional, Artifact-Available)
Research Track
Jiageng Li Fudan University, Zhen Dong Fudan University, Chong Wang Nanyang Technological University, Haozhen You Fudan University, Cen Zhang Georgia Institute of Technology, Yang Liu Nanyang Technological University, Xin Peng Fudan University
11:45
15m
Talk
Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests (Artifact-Functional, Artifact-Available)
Research Track
Amirhossein Deljouyi Delft University of Technology, Roham Koohestani Delft University of Technology, Maliheh Izadi Delft University of Technology, Andy Zaidman Delft University of Technology
12:00
15m
Talk
exLong: Generating Exceptional Behavior Tests with Large Language Models (Artifact-Available)
Research Track
Jiyang Zhang University of Texas at Austin, Yu Liu Meta, Pengyu Nie University of Waterloo, Junyi Jessy Li University of Texas at Austin, USA, Milos Gligoric The University of Texas at Austin
12:15
15m
Talk
TOGLL: Correct and Strong Test Oracle Generation with LLMs (Artifact-Available)
Research Track
Soneya Binta Hossain University of Virginia, Matthew B Dwyer University of Virginia