Change And Cover: Last-Mile, Pull Request-Based Regression Test Augmentation
Software is in constant evolution, with developers frequently submitting pull requests (PRs) to introduce new features or fix bugs. Testing newly added or modified code in PRs is critical to maintaining software quality. Yet, even in projects with extensive test suites, some of the code modified in PRs may remain untested, leaving a “last-mile” regression test gap. Existing automated test generators mostly focus on improving overall code coverage, but do not specifically target the uncovered lines in PRs. This paper presents Change And Cover (ChaCo), a novel, LLM-based test augmentation technique that specifically addresses the last-mile regression test gap in PRs. Our approach is enabled by three key contributions: (i) Instead of focusing on overall code coverage, ChaCo considers a specific PR and the lines left uncovered after applying the PR, offering developers augmented tests for the code while it is still on their minds. (ii) We identify providing suitable test context as a crucial challenge for an LLM to generate useful tests, and present two techniques to extract relevant test content, such as existing test functions, fixtures, and data generators. (iii) To make augmented tests acceptable for developers, ChaCo carefully integrates them into the existing test suite, e.g., by matching the test’s structure and style with the existing tests, and generates a summary of the test addition for developer review. We evaluate ChaCo on 145 PRs from three popular, complex, and well-tested open-source projects: SciPy, Qiskit, and Pandas. The approach successfully helps 30% of PRs achieve full patch coverage, at the affordable cost of $0.11 per PR, demonstrating its effectiveness and feasibility. A qualitative assessment of the generated tests shows that human reviewers find the tests to be worth adding (4.53/5.0), well integrated (4.20/5.0), and relevant to the PR (4.70/5.0).
In a contribution study, we submitted 12 tests to these projects, of which 8 have already been merged, and two previously unknown bugs were discovered and fixed. We envision our approach to be integrated into CI workflows, automating the last mile of regression test augmentation.
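As an illustration of the "last-mile" gap the abstract describes, the following is a minimal sketch of how the lines added by a PR could be compared against coverage data to find what remains untested. It is not ChaCo's implementation: the function names (`added_lines`, `uncovered_patch_lines`, `patch_coverage`) are hypothetical, the coverage data is assumed to be available as a mapping from file path to covered line numbers, and every added line is treated as executable, which real coverage tools do not assume.

```python
import re
from typing import Dict, List, Set

# Matches a unified-diff hunk header such as "@@ -10,4 +12,6 @@" and
# captures the starting line number on the new side of the diff.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text: str) -> Dict[str, Set[int]]:
    """Map each file in a unified diff to the new-side line numbers it adds.

    Simplified parser for illustration: it assumes well-formed `git diff`
    output and does not handle renames or binary files.
    """
    result: Dict[str, Set[int]] = {}
    current_file = None
    in_hunk = False
    new_lineno = 0
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_hunk = False
            continue
        if not in_hunk and line.startswith("+++ "):
            path = line[4:].strip()
            current_file = path[2:] if path.startswith("b/") else path
            result.setdefault(current_file, set())
            continue
        match = HUNK_HEADER.match(line)
        if match:
            in_hunk = True
            new_lineno = int(match.group(1))
            continue
        if not in_hunk or current_file is None:
            continue
        if line.startswith("+"):
            result[current_file].add(new_lineno)
            new_lineno += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers add nothing
        else:
            new_lineno += 1  # context line: present on the new side
    return result


def uncovered_patch_lines(
    changed: Dict[str, Set[int]], covered: Dict[str, Set[int]]
) -> Dict[str, List[int]]:
    """Return, per file, the changed lines that no existing test executes."""
    gaps: Dict[str, List[int]] = {}
    for path, lines in changed.items():
        missing = lines - covered.get(path, set())
        if missing:
            gaps[path] = sorted(missing)
    return gaps


def patch_coverage(
    changed: Dict[str, Set[int]], covered: Dict[str, Set[int]]
) -> float:
    """Fraction of changed lines that are covered (1.0 for an empty patch)."""
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        return 1.0
    hit = sum(len(lines & covered.get(path, set())) for path, lines in changed.items())
    return hit / total
```

A PR reaches full patch coverage in this sketch when `uncovered_patch_lines` returns an empty mapping; the remaining lines are the candidates a test augmentation step would target.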
Fri 17 Apr | Displayed time zone: Brasilia, Distrito Federal, Brazil
16:00 - 17:30 | AI for Software Engineering 29 | Journal-first Papers / Research Track at Oceania IX | Chair(s): Tien N. Nguyen (University of Texas at Dallas)
16:00 | 15m Talk | Learning Program Behavioral Models from Synthesized Input-Output Pairs | Journal-first Papers | Tural Mammadov (CISPA Helmholtz Center for Information Security), Dietrich Klakow (Saarland University), Alexander Koller (Saarland University), Andreas Zeller (CISPA Helmholtz Center for Information Security)
16:15 | 15m Talk | MeDeT: Medical Device Digital Twins Creation with Few-shot Meta-learning | Journal-first Papers | Hassan Sartaj (Simula Research Laboratory), Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University), Julie Marie Gjøby (Welfare Technologies Section, Oslo Kommune Helseetaten)
16:30 | 15m Talk | Change And Cover: Last-Mile, Pull Request-Based Regression Test Augmentation | Research Track | Zitong Zhou (UCLA), Matteo Paltenghi (University of Stuttgart), Miryung Kim (UCLA and Amazon Web Services), Michael Pradel (CISPA Helmholtz Center for Information Security)
16:45 | 15m Talk | HarnessLLM: Rust Verification Harness Generation with Large Language Models | Research Track
17:00 | 15m Talk | Agentic Predicates Reasoning for Directed Fuzzing | Research Track | Jie Zhu (University of Chicago), Chihao Shen (University of Maryland), Ziyang Li (Johns Hopkins University), Jiahao Yu (Northwestern University), Yizheng Chen (University of Maryland), Kexin Pei (The University of Chicago)
17:15 | 15m Talk | Relax with Capybaras | Research Track