RepoMind: Enhancing Repository-Level Code Generation via LLM Reasoning over Structured Repository Documentation
This program is tentative and subject to change.
Repository-level code generation aims to generate target code conditioned on the context of a specified repository. Existing approaches generally adopt the Retrieval-Augmented Generation (RAG) framework to retrieve relevant contextual information from the repository, thereby mitigating the challenge of long context windows. However, these methods typically struggle to identify the information that is truly relevant to the current generation requirement, and they generally overlook the need to understand the repository-level code structure during retrieval. To address these issues, we propose RepoMind, a novel framework that advances repository-level code generation by leveraging a repository documentation library and LLM reasoning capabilities for API retrieval. Specifically, RepoMind first introduces the RepoDocs Agent. This component constructs multi-granularity, hierarchical structural documentation in a bottom-up manner, providing LLMs with a high-quality knowledge base for understanding repository functionality and structure. Built upon this hierarchical knowledge base, RepoMind further integrates the Reasoning-Retrieval Agent. This agent mimics how human developers work by exploring the repository documentation layer by layer, and ultimately localizes the relevant API sets precisely. Finally, the framework combines the functionally relevant API set retrieved via LLM reasoning with the semantically similar API set retrieved by the vector retriever, using this hybrid context to boost code generation performance. Evaluation on the widely used repository-level code generation benchmarks CoderEval and DevEval demonstrates that RepoMind surpasses state-of-the-art methods, achieving up to a 13.50% relative improvement in Pass@1 scores.
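The final step described above, combining the API set found by LLM reasoning with the API set returned by the vector retriever, can be pictured with a minimal sketch. The abstract does not say how the two sets are merged, so the order-preserving union below, the function name `merge_api_sets`, the `limit` parameter, and every API name are assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List


def merge_api_sets(reasoned: List[str], similar: List[str], limit: int = 10) -> List[str]:
    """Order-preserving union of two API lists (hypothetical merge rule).

    APIs selected by reasoning come first, followed by vector-retrieved
    APIs that were not already selected. Duplicates are dropped.
    """
    merged: List[str] = []
    seen = set()
    for name in reasoned + similar:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    # Cap the context size; the cap value is an assumption.
    return merged[:limit]


# Hypothetical API names, used only to exercise the function.
reasoned_apis = ["repo.io.load_config", "repo.core.Session.open"]
similar_apis = ["repo.core.Session.open", "repo.utils.parse_path"]

hybrid_context = merge_api_sets(reasoned_apis, similar_apis)
print(hybrid_context)
```

Placing the reasoning-selected APIs first is one possible design choice: it keeps them in the context if the list has to be truncated. A real system might instead rank or interleave the two sets.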
Mon 13 Apr
Displayed time zone: Brasilia, Distrito Federal, Brazil
11:00 - 12:30 | Session 5 - Summarization, Documentation, and Code Review | Research Track / Vaclav Rajlich Early Career Award / ICPC Program / Journal First | at Europa II | Chair(s): Masud Rahman (Dalhousie University)
11:00 10m Talk | Vaclav Rajlich Award | Vaclav Rajlich Early Career Award | Marvin Wyrich (Saarland University)
11:10 10m Talk | RepoMind: Enhancing Repository-Level Code Generation via LLM Reasoning over Structured Repository Documentation | Research Track | Songwen Gong (South China University of Technology), Mengzhen Wang (South China University of Technology), Jiexin Wang (South China University of Technology), Yi Cai (School of Software Engineering, South China University of Technology, Guangzhou, China)
11:20 10m Talk | SQL-Commenter: Aligning Large Language Models for SQL Comment Generation with Direct Preference Optimization | Research Track | Lei Yu (Institute of Software, Chinese Academy of Sciences, University of Chinese Academy of Sciences, China), Peng Wang (Institute of Information Engineering, Chinese Academy of Sciences), Jingyuan Zhang (Institute of Software, Chinese Academy of Sciences, University of Chinese Academy of Sciences, China), Xin Wang (Institute of Software, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Jia Xu (Institute of Software, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Li Yang (Institute of Software, Chinese Academy of Sciences), Changzhi Deng (Institute of Software, Chinese Academy of Sciences), Jiajia Ma (Institute of Software, Chinese Academy of Sciences, China), Fengjun Zhang (Institute of Software, Chinese Academy of Sciences, China)
11:30 10m Talk | Studying Quality Improvements Recommended via Manual and Automated Code Review | Research Track | Giuseppe Crupi (Università della Svizzera italiana), Rosalia Tufano (Università della Svizzera Italiana), Gabriele Bavota (Software Institute @ Università della Svizzera Italiana)
11:40 10m Talk | Towards Universal Segmentation for Log Parsing | Research Track | Van-Hoang Le (University of Luxembourg, Luxembourg), Domenico Bianculli (University of Luxembourg), Huy-Trung Nguyen (Posts and Telecommunications Institute of Technology)
11:50 10m Talk | DPS: Design Pattern Summarisation Using Code Features | Journal First | Najam Nazar (Monash University), Sameer Sikka (University of Melbourne), Christoph Treude (Singapore Management University)
12:00 10m Talk | On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study | Research Track | Antonio Vitale (Politecnico di Torino, University of Molise), Emanuela Guglielmi (University of Molise), Simone Scalabrino (University of Molise), Rocco Oliveto (University of Molise)
12:10 20m Live Q&A | Joint QA and Discussion | ICPC Program