MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning
Program repair requires reasoning about why a change is correct, not only recognizing edit patterns. Existing fine-tuning approaches for LLM-based program repair generally overlook the need to reason about the logic behind code changes, beyond the syntactic patterns in the data. High-performing fine-tuning experiments also typically incur very high computational costs. With MORepair, we propose a novel perspective on the learning focus of LLM fine-tuning for program repair: we not only adapt the LLM parameters to the syntactic nuances of the code transformation task (objective ①), but also specifically fine-tune the LLM with respect to the logical reason behind the code change in the training data (objective ②). This multi-objective fine-tuning instructs LLMs to generate high-quality patches. The MORepair workflow consists of three phases: preparing guidance from paired buggy and fixed code, fine-tuning lightweight adapters with QLoRA under a joint loss, and inference, which produces candidate patches that are verified by tests.
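The joint loss in the fine-tuning phase can be sketched as a weighted sum of two token-level objectives: one over the fixed code (objective ①) and one over the natural-language repair guidance (objective ②). A minimal sketch follows; the weighting factor `lam` and the per-objective averaging are illustrative assumptions, not the paper's exact formulation.

```python
import math

def cross_entropy(token_probs, target_idx):
    """Negative log-likelihood of the target token under the model's distribution."""
    return -math.log(token_probs[target_idx])

def joint_loss(patch_probs, patch_targets, guide_probs, guide_targets, lam=1.0):
    """Multi-objective loss combining patch generation and repair rationale.

    `lam` balances the two objectives; its value here is an illustrative
    assumption, not a setting taken from the paper.
    """
    # Objective ①: average token loss over the fixed (patched) code
    l_patch = sum(cross_entropy(p, t)
                  for p, t in zip(patch_probs, patch_targets)) / len(patch_targets)
    # Objective ②: average token loss over the natural-language repair guidance
    l_guide = sum(cross_entropy(p, t)
                  for p, t in zip(guide_probs, guide_targets)) / len(guide_targets)
    return l_patch + lam * l_guide
```

In the MORepair setting, a combined loss of this shape is backpropagated through lightweight QLoRA adapter weights rather than the full model, which keeps the computational cost of fine-tuning low.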
We apply MORepair to fine-tune four open-source LLMs of different sizes and architectures. Experimental results on function-level and repository-level repair benchmarks show that our fine-tuning improves LLM repair performance by 11.4% to 56.0%. We further show that our fine-tuning strategy outperforms state-of-the-art approaches, including standard fine-tuning, Fine-tune-CoT, and RepairLLaMA.
Overall, this paper makes the following contributions:
[Approach]: MORepair uses multi-objective fine-tuning to couple patch generation with guidance that captures the repair rationale, enabling higher-quality patches.
[Benchmarks]: We build EvalRepair-C++ and EvalRepair-Java with 164 and 163 items, derived from HumanEval-X and HumanEval-Java, plus augmented tests to reduce patch overfitting. We also provide D4J-Repair (371 Java bugs derived from Defects4J) and SWE-Repair (204 Python bugs derived from SWE-bench).
[Experiments and insights]: Across models and languages, MORepair consistently outperforms prior methods. Ablations show that either objective alone is suboptimal; the joint objective yields the best accuracy and more logically consistent fixes.
[Artifacts]: Our research artifacts, including code and the reproduction data, are publicly available at: https://github.com/buaabarty/morepair.
Mon 17 Nov, 11:00 - 12:30 (displayed time zone: Seoul)
11:00 (10m, Talk) — Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs. Research Papers. Jian Wang (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Qiang Hu (Tianjin University), Shangqing Liu (Nanjing University), Jiongchi Yu (Singapore Management University), Jiaolong Kong (Singapore Management University), Yi Li (Nanyang Technological University). Pre-print available.
11:10 (10m, Talk) — MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning. Journal-First. Boyang Yang (Yanshan University), Haoye Tian (Aalto University), Jiadong Ren (Yanshan University), Hongyu Zhang (Chongqing University), Jacques Klein (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg), Claire Le Goues (Carnegie Mellon University), Shunfu Jin (Yanshan University). Publication link, DOI, and pre-print available.
11:20 (10m, Talk) — Test-based Patch Clustering for Automatically-Generated Patches Assessment. Journal-First. Matias Martinez (Universitat Politècnica de Catalunya (UPC)), Maria Kechagia (National and Kapodistrian University of Athens), Anjana Perera (Oracle Labs, Australia), Justyna Petke (University College London), Federica Sarro (University College London), Aldeida Aleti (Monash University).
11:30 (10m, Talk) — Hierarchical Knowledge Injection for Improving LLM-based Program Repair. Research Papers. Ramtin Ehsani (Drexel University), Esteban Parra Rodriguez (Belmont University), Sonia Haiduc (Florida State University), Preetha Chatterjee (Drexel University, USA).
11:40 (10m, Talk) — Characterizing Multi-Hunk Patches: Divergence, Proximity, and LLM Repair Challenges. Research Papers. Noor Nashid (University of British Columbia), Daniel Ding (University of British Columbia), Keheliya Gallaba (Centre for Software Excellence), Ahmed E. Hassan (Queen's University), Ali Mesbah (University of British Columbia). Pre-print available.
11:50 (10m, Talk) — Reinforcement Learning for Mutation Operator Selection in Automated Program Repair. Journal-First. Carol Hanna (University College London), Aymeric Blot (University of Rennes, IRISA / INRIA), Justyna Petke (University College London).
12:00 (10m, Talk) — Seeing is Fixing: Cross-Modal Reasoning with Multimodal LLMs for Visual Software Issue Repair. Research Papers. Kai Huang (Technical University of Munich), Jian Zhang (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Chunyang Chen (TU Munich).