When Retriever Meets Generator: A Joint Model for Code Comment Generation
This program is tentative and subject to change.
Automatically generating concise, informative comments for source code can lighten the documentation effort and accelerate program comprehension. Retrieval-augmented approaches first fetch code snippets with existing comments and then synthesize a new comment, yet retrieval and generation are typically optimized in isolation, allowing irrelevant neighbors to propagate noise downstream. To tackle this issue, we propose RAGSum, a novel approach that aims at both effectiveness and efficiency in comment recommendation. RAGSum fuses retrieval and generation on top of a single CodeT5 backbone, and we report preliminary results on this unified retrieval-generation framework. A contrastive pre-training phase shapes code embeddings for nearest-neighbor search; these weights then seed end-to-end training with a composite loss that (i) rewards accurate top-k retrieval and (ii) minimizes comment-generation error. In addition, a lightweight self-refinement loop polishes the final output. We evaluated the framework on three cross-language benchmarks (Java, Python, C) and compared it with three well-established baselines. The results show that our approach substantially outperforms the baselines with respect to BLEU, METEOR, and ROUGE-L scores. These early findings indicate that tightly coupling retrieval and generation can raise the ceiling for comment automation, and they motivate forthcoming industrial replications and qualitative developer studies.
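The abstract describes a composite training objective that combines a contrastive retrieval term with a generation term. The sketch below illustrates how such an objective might be assembled; the InfoNCE-style contrastive term, the weighting factor `alpha`, and all function names are illustrative assumptions, not details taken from the paper.

```python
import math

def info_nce(sim_pos, sim_negs, tau=0.07):
    # InfoNCE-style contrastive loss: pull the positive code pair together,
    # push negatives apart. A hypothetical stand-in for the retrieval term;
    # tau is a temperature hyperparameter (value assumed, not from the paper).
    logits = [sim_pos / tau] + [s / tau for s in sim_negs]
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

def generation_nll(token_probs, target_idx):
    # Token-level generation loss: negative log-likelihood of the
    # reference comment token under the model's output distribution.
    return -math.log(token_probs[target_idx])

def composite_loss(retrieval_loss, generation_loss, alpha=0.5):
    # Weighted sum of the two terms; alpha is a hypothetical balancing
    # hyperparameter, not a value reported by the authors.
    return alpha * retrieval_loss + (1 - alpha) * generation_loss

# Toy numbers: one positive similarity, three negatives, and a 3-token
# output distribution where the reference token has probability 0.7.
ret = info_nce(0.9, [0.2, 0.1, -0.3])
gen = generation_nll([0.1, 0.7, 0.2], 1)
total = composite_loss(ret, gen)
print(ret, gen, total)
```

In an end-to-end setup like the one the abstract outlines, both terms would be computed from the same encoder so that gradients from the generation error also reshape the retrieval embeddings.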
Thu 2 Oct | Displayed time zone: Hawaii
13:50 - 14:50 | Program Comprehension and Review 1 | ESEM - Industry, Government, and Community Track / ESEM - Emerging Results and Vision Track / ESEM - Technical Track | Kaiulani II
13:50 | 15m Talk | When Retriever Meets Generator: A Joint Model for Code Comment Generation | ESEM - Emerging Results and Vision Track | Tien L. T. Pham (Hanoi University of Science and Technology), Anh M. T. Bui (Hanoi University of Science and Technology), Huy N. D. Pham (AI Young Talent Academy (AI4Life), Hanoi University of Science and Technology), Alessio Bucaioni (Mälardalen University), Phuong T. Nguyen (University of L'Aquila) | Pre-print
14:05 | 15m Talk | From Assessment to Enhancement of Pull Requests at Scale: Aligning Code Reviews with Developer Competencies Using Large Language Models | ESEM - Industry, Government, and Community Track | Luca Mariotto (Hasso-Plattner Institute), Christian Medeiros Adriano (Hasso Plattner Institute, University of Potsdam), René Eichhorn (Mercedes-Benz Tech Innovation), Daniel Burgstahler (Mercedes-Benz Tech Innovation), Holger Giese (Hasso Plattner Institute, University of Potsdam)
14:20 | 15m Talk | Rethinking Code Review Workflows with LLM Assistance: An Empirical Study | ESEM - Industry, Government, and Community Track | Fannar Steinn Aðalsteinsson (WirelessCar Sweden AB & Chalmers University of Technology), Björn Borgar Magnússon (WirelessCar Sweden AB), Mislav Milicevic (WirelessCar Sweden AB), Adam Nirving Davidsson (WirelessCar Sweden AB), Chih-Hong Cheng (Carl von Ossietzky Universität Oldenburg & Chalmers University of Technology)
14:35 | 15m Talk | Interrogative Comments Posed by Review Comment Generators: An Empirical Study of Gerrit | ESEM - Technical Track | Farshad Kazemi (University of Waterloo), Maxime Lamothe (Polytechnique Montreal), Shane McIntosh (University of Waterloo) | Pre-print