Doc2OracLL: Investigating the Impact of Documentation on LLM-based Test Oracle Generation
Code documentation is a critical aspect of software development, serving as a bridge between human understanding and machine-readable code. Beyond helping developers understand and maintain code, documentation also plays an important role in automating software engineering tasks such as test oracle generation (TOG). In Java, Javadoc comments provide structured, natural language documentation embedded directly in the source code, typically detailing functionality, usage, parameters, return values, and exceptions. While prior research has used Javadoc comments in TOG, there has been no thorough investigation of their impact when combined with other contextual information, of which components are most relevant for generating correct and strong test oracles, or of their role in detecting real bugs. In this study, we investigate the impact of Javadoc comments on TOG in depth. We start by fine-tuning 10 large language models with three different prompt pairs designed to investigate the impact of Javadoc comments when used with other contextual information. We then conduct a systematic analysis to assess the impact of different Javadoc components on TOG. To investigate how well the findings generalize to Javadoc comments from different sources, we also generate Javadoc comments using the GPT-3.5 model. Finally, we perform a thorough bug detection study using Defects4J to understand the role of Javadoc comments in real-world bug detection. Our results show that, in most cases, incorporating Javadoc comments improves the accuracy of test oracles, aligning them closely with ground truth. We find that Javadoc comments alone can nearly match the performance achieved when using both Javadoc comments and the code of the method under test (MUT). We also find that the description and the return tags of the Javadoc comments are the most valuable components in TOG. Finally, when using just Javadoc comments, our method detects between 19% and 94% more real-world bugs in Defects4J than prior methods.
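To make the terminology concrete, the following is a minimal illustrative sketch, not taken from the paper: a small Java method with a Javadoc comment containing a description, `@param`, `@return`, and `@throws` tags, together with two checks a test oracle could derive from that documentation alone. The class name `JavadocOracleSketch` and the method `maxNonNegative` are hypothetical and chosen only for illustration.

```java
class JavadocOracleSketch {

    /**
     * Returns the larger of two non-negative integers.
     *
     * @param a the first value, must be non-negative
     * @param b the second value, must be non-negative
     * @return the larger of {@code a} and {@code b}
     * @throws IllegalArgumentException if either argument is negative
     */
    public static int maxNonNegative(int a, int b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return Math.max(a, b);
    }

    // Check derived from the description and the @return tag:
    // for inputs 3 and 7 the documented result is 7.
    public static boolean returnOracleHolds() {
        return maxNonNegative(3, 7) == 7;
    }

    // Check derived from the @throws tag:
    // a negative argument is documented to raise IllegalArgumentException.
    public static boolean throwsOracleHolds() {
        try {
            maxNonNegative(-1, 2);
            return false; // no exception was thrown, so the oracle fails
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("return oracle holds: " + returnOracleHolds());
        System.out.println("throws oracle holds: " + throwsOracleHolds());
    }
}
```

Both checks can be written without reading the method body, which is the situation the abstract describes when oracles are generated from Javadoc comments alone.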
Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
10:30 - 12:30 | Test Generation | Research Papers / Industry Papers | Cosmos Hall | Chair(s): Michael Pradel (University of Stuttgart)
10:30 | 20m | Talk | CoverUp: Effective High Coverage Test Generation for Python | Research Papers | Juan Altmayer Pizzorno (University of Massachusetts Amherst), Emery D. Berger (University of Massachusetts Amherst and Amazon Web Services)
11:00 | 20m | Talk | Doc2OracLL: Investigating the Impact of Documentation on LLM-based Test Oracle Generation | Research Papers | Soneya Binta Hossain (University of Virginia), Raygan Taylor (Dillard University), Matthew B Dwyer (University of Virginia)
11:20 | 20m | Talk | Less is More: On the Importance of Data Quality for Unit Test Generation | Research Papers | Junwei Zhang (Zhejiang University), Xing Hu (Zhejiang University), Shan Gao (Huawei), Xin Xia (Zhejiang University), David Lo (Singapore Management University), Shanping Li (Zhejiang University)
11:40 | 20m | Talk | Mutation-Guided LLM-based Test Generation at Meta | Industry Papers | Mark Harman (Meta Platforms, Inc. and UCL), Jillian Ritchey (Meta platforms), Inna Harper (Meta), Shubho Sengupta (Meta platforms), Ke Mao (Meta), Abhishek Gulati (Meta platforms), Christopher Foster (Meta platforms), Hervé Robert (Meta platforms)
12:00 | 10m | Talk | LSPAI: An IDE Plugin for LLM-Powered Multi-Language Unit Test Generation with Language Server Protocol | Industry Papers | Gwihwan Go (Tsinghua University), Chijin Zhou (Tsinghua University), Quan Zhang (Tsinghua University), Yu Jiang (Tsinghua University), Zhao Wei (Tencent)
12:10 | 20m | Talk | Can Generative AI Produce Test Cases? An Experience from the Automotive Domain | Industry Papers | Stephen Wynn-Williams (McMaster University, Canada), Ryan Tyrrell (McMaster University), Vera Pantelic (McMaster University), Mark Lawford (McMaster University), Claudio Menghi (University of Bergamo; McMaster University), Phaneendra Nalla (FCA US LLC), Hassan Artail (FCA US LLC)
Cosmos Hall is the main event hall of the Clarion Hotel and will host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also take place in this room.
The room is directly across from the registration desk, on the other side of the main conference area. The large doors numbered “1” and “2” provide access to Cosmos Hall.