APSEC 2024
Tue 3 - Fri 6 December 2024 China
Fri 6 Dec 2024 09:30 - 10:00 at Room 1 (Zunhui Room) - Session (16) Chair(s): Haoye Tian

In software engineering automation, code language models (CLMs) have made significant strides in code generation tasks. However, because updating their knowledge is costly and they are prone to hallucination, CLMs face challenges in practical code generation scenarios, which has made retrieval-augmented code generation a mainstream approach. Existing retrieval-augmented methods build a codebase for only a single programming language, so they cannot compensate when the knowledge available in that one language is insufficient. To address this, we propose CodeRCSG, a novel cross-lingual retrieval-augmented code generation method. The method constructs a multilingual codebase and creates a unified cross-lingual code semantic graph to capture deep semantic information across different programming languages. By encoding the retrieved code semantic graph with a graph neural network (GNN) and combining it with the input text embeddings, CLMs can effectively use the transferred cross-lingual programming knowledge to improve the quality of generated code. Experimental results show that CodeRCSG can significantly enhance the code generation capabilities of CLMs.
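
The pipeline the abstract describes (retrieve from a multilingual codebase, encode a shared semantic graph, combine it with the text embedding) can be illustrated with a toy sketch. This is not the CodeRCSG implementation: the codebase entries, the node vocabulary, the word-overlap retriever, and the single mean-aggregation step standing in for a GNN are all assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical multilingual codebase: (language, description, semantic graph).
# Node labels are language-agnostic so snippets from different languages share
# one vocabulary; edges are (source, target) node-index pairs.
CODEBASE = [
    ("python", "sum all numbers in a list",
     {"nodes": ["func", "loop", "add", "return"],
      "edges": [(0, 1), (1, 2), (0, 3)]}),
    ("java", "reverse a string",
     {"nodes": ["func", "loop", "index", "return"],
      "edges": [(0, 1), (1, 2), (0, 3)]}),
]

VOCAB = ["func", "loop", "add", "index", "return"]


def retrieve(query):
    """Return the entry whose description shares the most words with the query."""
    q = Counter(query.lower().split())
    return max(CODEBASE, key=lambda e: sum((q & Counter(e[1].split())).values()))


def one_hot(label):
    """Encode a node label as a one-hot vector over VOCAB."""
    return [1.0 if label == v else 0.0 for v in VOCAB]


def encode_graph(graph):
    """One round of mean-neighbor message passing, then mean pooling.

    A stand-in for a trained GNN encoder, with no learned weights.
    """
    feats = [one_hot(n) for n in graph["nodes"]]
    neighbors = {i: [] for i in range(len(feats))}
    for s, t in graph["edges"]:
        neighbors[s].append(t)
        neighbors[t].append(s)
    updated = []
    for i, f in enumerate(feats):
        group = [f] + [feats[j] for j in neighbors[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return [sum(col) / len(updated) for col in zip(*updated)]


def build_model_input(query, text_embedding):
    """Concatenate the text embedding with the retrieved graph's embedding."""
    lang, _, graph = retrieve(query)
    return lang, text_embedding + encode_graph(graph)


# A Python-oriented request retrieves the Java entry, because the retriever
# matches on the task description rather than on the target language.
lang, vec = build_model_input("reverse a string in python", [0.1, 0.2, 0.3])
print(lang, len(vec))
```

In this sketch the retrieved knowledge crosses languages only because every graph uses the same node vocabulary; a real system would need a parser per language to map source code onto such a shared graph schema.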

Fri 6 Dec

Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

09:30 - 10:30
Session (16) Technical Track at Room 1 (Zunhui Room)
Chair(s): Haoye Tian University of Melbourne
09:30
30m
Talk
Enhancing Code Generation through Retrieval of Cross-Lingual Semantic Graphs
Technical Track
Zhijie Jiang National University of Defense Technology, Zejian Shi Fudan University, Xinyu Gao, Yun Xiong Fudan University
10:00
30m
Talk
Optimizing LLMs for Code Generation: Which Hyperparameter Settings Yield the Best Results?
Technical Track
Chetan Arora Monash University, Ahnaf Ibn Sayeed Monash University, Sherlock A. Licorish University of Otago, Fanyu Wang Monash University, Christoph Treude Singapore Management University