Unit Test Update through LLM-Driven Context Collection and Error-Type-Aware Refinement
This program is tentative and subject to change.
Unit testing is critical for ensuring software quality and system stability. The current practice of manually maintaining unit tests suffers from low efficiency and the risk of delayed or overlooked fixes. An automated approach is therefore needed to update unit tests just in time, with the capability to both repair and enhance them. However, existing automated test maintenance methods primarily focus on repairing broken tests, neglecting the scenario of enhancing existing tests to verify new functionality. Meanwhile, owing to their reliance on rule-based context collection and their lack of verification mechanisms, existing approaches struggle to handle complex code changes and often produce test cases with low correctness. To address these challenges, we propose TestUpdater, a novel Large Language Model (LLM)-based approach that enables automated just-in-time test updates in response to production code changes. Emulating the reasoning process of developers, TestUpdater first leverages the LLM to analyze code changes and identify relevant context, which it then extracts and filters. This LLM-driven context collector can flexibly gather accurate and sufficient context, enabling better handling of complex code changes. Then, through carefully designed prompts, TestUpdater guides the LLM step by step to handle various types of code changes and introduce new dependencies, enabling both the repair of broken tests and their enhancement. Finally, emulating the debugging process, we introduce an error-type-aware iterative refinement mechanism that executes the LLM-updated tests and repairs failures, significantly improving the overall correctness of test updates. Since existing test repair datasets lack test enhancement scenarios, we further construct a new benchmark, Updates4J, with 195 real-world samples from 7 projects, enabling execution-based evaluation of test updates.
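The "analyze, extract, filter" context-collection step described above can be sketched in miniature. This is an illustrative sketch only, not the paper's implementation: the function names (`ask_llm_for_relevant_symbols`, `collect_context`) and the identifier-based stand-in for the LLM call are hypothetical.

```python
import re

def ask_llm_for_relevant_symbols(diff: str) -> list[str]:
    # Stand-in for the LLM call that reads a production-code diff and names
    # the symbols an updated test will likely need. Here we approximate it
    # by collecting identifiers that appear on added (+) lines of the diff.
    added = [line[1:] for line in diff.splitlines() if line.startswith("+")]
    return sorted({ident for line in added
                   for ident in re.findall(r"[A-Za-z_]\w*", line)})

def collect_context(diff: str, codebase: dict[str, str]) -> dict[str, str]:
    """Extract, then filter: keep only snippets for symbols the LLM named."""
    wanted = set(ask_llm_for_relevant_symbols(diff))
    return {sym: src for sym, src in codebase.items() if sym in wanted}

# Toy codebase: only the method touched by the diff survives filtering.
diff = "+ int applyDiscount(Order order)"
codebase = {"applyDiscount": "int applyDiscount(Order o) { ... }",
            "unrelatedHelper": "void unrelatedHelper() { ... }"}
ctx = collect_context(diff, codebase)
print(sorted(ctx))  # prints ['applyDiscount']
```

The point of the sketch is the division of labor: the model decides *what* is relevant, while deterministic tooling does the extraction, which is what lets the collector adapt to complex changes where fixed rules fall short.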
Experimental results show that TestUpdater achieves a compilation pass rate of 94.4% and a test pass rate of 84.6%, outperforming the state-of-the-art method Synter by 15.4% and 16.9%, respectively. Furthermore, TestUpdater exhibits 12.5% higher branch coverage and 14.1% greater line coverage than Synter.
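The error-type-aware refinement loop from the abstract can likewise be sketched. Everything here is hypothetical scaffolding (the `ErrorType` categories, `classify`, `refine`, and the toy `fake_run`/`fake_repair` harness); the paper's actual error taxonomy and repair prompts are not reproduced.

```python
from enum import Enum, auto
from typing import Optional

class ErrorType(Enum):
    COMPILATION = auto()   # e.g. javac "cannot find symbol"
    ASSERTION = auto()     # failing assertion in the updated test
    RUNTIME = auto()       # any other exception at test runtime

def classify(stderr: str) -> Optional[ErrorType]:
    # Map raw execution output to a coarse error type; None means the test passed.
    if "error:" in stderr:
        return ErrorType.COMPILATION
    if "AssertionError" in stderr:
        return ErrorType.ASSERTION
    if "Exception" in stderr:
        return ErrorType.RUNTIME
    return None

def refine(test_code: str, run, repair, max_rounds: int = 3):
    """Execute the LLM-updated test; on failure, request an error-type-specific repair."""
    for _ in range(max_rounds):
        err = classify(run(test_code))
        if err is None:
            return test_code, True        # test compiles and passes
        test_code = repair(test_code, err)  # prompt tailored to the error type
    return test_code, False

# Toy harness: first run hits a compile error, the repaired version passes.
def fake_run(code):
    return "" if "fixed" in code else "error: cannot find symbol"

def fake_repair(code, err):
    return code + " /* fixed for " + err.name + " */"

final, ok = refine("assertEquals(total, order.total());", fake_run, fake_repair)
print(ok)  # True after one repair round
```

Classifying the failure before re-prompting is what distinguishes this from naive retry: a compile error and a failing assertion call for different repair instructions, mirroring how a developer debugs.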
Mon 17 Nov (displayed time zone: Seoul)
14:00 - 15:30

14:00 (10m) Talk | Mokav: Execution-driven Differential Testing with LLMs | Journal-First Track | Khashayar Etemadi (ETH Zurich), Bardia Mohammadi (Sharif University of Technology), Zhendong Su (ETH Zurich), Martin Monperrus (KTH Royal Institute of Technology)

14:10 (10m) Talk | Validity-Preserving Delta Debugging via Generator Trace Reduction | Journal-First Track | Luyao Ren (Peking University), Xing Zhang (Peking University), Ziyue Hua (Peking University), Yanyan Jiang (Nanjing University), Xiao He (Bytedance), Yingfei Xiong (Peking University), Tao Xie (Peking University)

14:20 (10m) Talk | Execution-Aware Program Reduction for WebAssembly via Record and Replay | Research Papers | Doehyun Baek (University of Stuttgart), Daniel Lehmann (Google, Germany), Ben L. Titzer (Carnegie Mellon University), Sukyoung Ryu (KAIST), Michael Pradel (CISPA Helmholtz Center for Information Security)

14:30 (10m) Talk | DebCovDiff: Differential Testing of Coverage Measurement Tools on Real-World Projects | Research Papers | Wentao Zhang (University of Illinois Urbana-Champaign), Jinghao Jia (University of Illinois Urbana-Champaign), Erkai Yu (University of Illinois Urbana-Champaign), Darko Marinov (University of Illinois at Urbana-Champaign), Tianyin Xu (University of Illinois at Urbana-Champaign)

14:40 (10m) Talk | DRIFT: Debug-based Trace Inference for Firmware Testing | Research Papers | Changming Liu (Northeastern University), Alejandro Mera (Northeastern University), Meng Xu (University of Waterloo), Engin Kirda (Northeastern University)

14:50 (10m) Talk | Enhancing Differential Testing With LLMs For Testing Deep Learning Libraries | Journal-First Track | Meiziniu LI (The Hong Kong University of Science and Technology), Dongze Li (The Hong Kong University of Science and Technology), Jianmeng Liu (The Hong Kong University of Science and Technology), Jialun Cao (Hong Kong University of Science and Technology), Yongqiang Tian (Monash University), Shing-Chi Cheung (Hong Kong University of Science and Technology)

15:00 (10m) Talk | Unit Test Update through LLM-Driven Context Collection and Error-Type-Aware Refinement | Research Papers | Yuanhe Zhang (Zhejiang University), Zhiquan Yang (Zhejiang University), Shengyi Pan (Zhejiang University), Zhongxin Liu (Zhejiang University)

15:10 (10m) Talk | Metamorphic Testing for Audio Content Moderation Software | Research Papers | Wenxuan Wang (Hong Kong University of Science and Technology), Yongjiang Wu (The Chinese University of Hong Kong), Junyuan Zhang (The Chinese University of Hong Kong), Shuqing Li (The Chinese University of Hong Kong), Yun Peng (The Chinese University of Hong Kong), Wenting Chen (City University of Hong Kong), Shuai Wang (Hong Kong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong)

15:20 (10m) Talk | Comprehend, Imitate, and then Update: Unleashing the Power of LLMs in Test Suite Evolution | Research Papers | Tangzhi Xu (Nanjing University), Jianhan Liu (Nanjing University), Yuan Yao (Nanjing University), Cong Li (ETH Zurich), Feng Xu (Nanjing University), Xiaoxing Ma (Nanjing University)