Contextual Code Retrieval for Commit Message Generation: A Preliminary Study
This program is tentative and subject to change.
A commit message describes the main code changes in a commit and plays a crucial role in software maintenance. Existing commit message generation (CMG) approaches typically frame the task as a direct mapping that takes a code diff as input and produces a brief descriptive sentence as output. However, we argue that relying solely on the code diff is insufficient, because a raw diff fails to capture the full context needed to generate high-quality, informative commit messages.
In this paper, we propose C3Gen, a contextual code retrieval-based method that enhances CMG by retrieving commit-relevant code snippets from the repository and incorporating them into the model input, providing richer contextual information at the repository scope. In our experiments, we evaluate the effectiveness of C3Gen across various models using four objective and three subjective metrics. We also design and conduct a human evaluation to investigate how developers perceive C3Gen-generated commit messages. The results show that, by incorporating contextual code into the input, C3Gen enables models to leverage the additional information and generate more comprehensive and informative commit messages with greater practical value in real-world development scenarios. Further analysis underscores concerns about the reliability of similarity-based metrics and provides empirical insights for CMG.
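The abstract does not say how C3Gen selects commit-relevant snippets or how it formats the model input, so the following is only a minimal sketch of the general idea: rank repository snippets against the diff, then append the best matches to the prompt. Token-overlap (Jaccard) scoring is used here as a stand-in for whatever retrieval the paper actually uses, and the names `Snippet`, `retrieve_context`, and `build_prompt` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    """A candidate piece of repository code (hypothetical structure)."""
    path: str
    code: str


def tokenize(text: str) -> set[str]:
    """Extract the set of identifier-like tokens from text."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_context(diff: str, snippets: list[Snippet], top_k: int = 2) -> list[Snippet]:
    """Return up to top_k snippets that share tokens with the diff.

    Assumption: lexical overlap stands in for the paper's retrieval step.
    """
    diff_tokens = tokenize(diff)
    scored = [(jaccard(diff_tokens, tokenize(s.code)), s) for s in snippets]
    # Drop snippets with no overlap at all, then rank by score.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]


def build_prompt(diff: str, context: list[Snippet]) -> str:
    """Assemble a model input from the diff plus retrieved context."""
    parts = ["Write a commit message for the following change.", "", "Diff:", diff]
    if context:
        parts += ["", "Related code from the repository:"]
        for s in context:
            parts += [f"# {s.path}", s.code]
    return "\n".join(parts)
```

Calling `build_prompt(diff, retrieve_context(diff, snippets))` yields a single string that could be passed to any text-generation model; with an empty snippet list the prompt falls back to the diff alone, which corresponds to the diff-only setup the abstract argues is insufficient.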
Fri 3 Oct | Displayed time zone: Hawaii
14:00 - 15:20 | LLMs for Code Generation, Translation, and Maintainability | ESEM - Technical Track / ESEM - Emerging Results and Vision Track, at Kaiulani I | Chair(s): Ivan Machado (Federal University of Bahia - UFBA)
14:00 | 20m | Talk | A Fully Automated Agent for End-to-End Code Translation and Validation | ESEM - Emerging Results and Vision Track | Eray Erer (Boğaziçi University), Ayşe Başar (Ryerson University), Aysun Bozanta (Bogazici University), Turgay Aytac (Comunale Capital)
14:20 | 20m | Talk | Contextual Code Retrieval for Commit Message Generation: A Preliminary Study | ESEM - Emerging Results and Vision Track | Bo Xiong (Wuhan University), Linghao Zhang (Wuhan University), Chong Wang (Wuhan University), Peng Liang (Wuhan University, China)
14:40 | 20m | Talk | How Small is Enough? Empirical Evidence of Quantized Small Language Models for Automated Program Repair | ESEM - Emerging Results and Vision Track | Kazuki Kusama, Honglin Shu (Kyushu University), Masanari Kondo (Kyushu University), Yasutaka Kamei (Kyushu University)
15:00 | 20m | Talk | Is LLM-Generated Code More Maintainable & Reliable than Human-Written Code? | ESEM - Technical Track | Alfred Santa Molison (Toronto Metropolitan University), Fabio Marcos De Abreu Santos (Colorado State University, USA), Marcia Moraes (Colorado State University), Glaucia Melo (Toronto Metropolitan University), Wesley Assunção (North Carolina State University)