LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
Code generation aims to automatically produce code from input requirements, significantly enhancing development efficiency. Recent approaches based on large language models (LLMs) have shown promising results and have revolutionized the code generation task. Despite this promising performance, LLMs often generate content with hallucinations, especially in code generation scenarios that require handling complex contextual dependencies in practical development. Although previous work has analyzed hallucinations in LLM-powered code generation, it is limited to standalone function generation. In this paper, we conduct an empirical study of the phenomena, mechanism, and mitigation of LLM hallucinations in a more practical and complex development context: the repository-level code generation scenario. First, we manually examine the code generation results of six mainstream LLMs to establish a hallucination taxonomy of LLM-generated code. Next, we elaborate on the phenomenon of hallucinations and analyze their distribution across the different models. We then analyze the causes of hallucinations and identify four potential contributing factors. Finally, we propose an RAG-based mitigation method, which demonstrates consistent effectiveness across all studied LLMs. The replication package, including code, data, and experimental results, is anonymously available at https://anonymous.4open.science/r/LLMCodingHallucination/.
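The abstract does not describe the authors' RAG-based mitigation in detail; purely as an illustration, retrieval-augmented generation for repository-level code generation can be sketched as retrieving relevant repository snippets and prepending them to the prompt so the model is grounded in real project context. The keyword-overlap retriever and prompt format below are illustrative assumptions, not the paper's method:

```python
# Generic RAG sketch for repository-level code generation.
# NOT the paper's method: the retriever (word overlap) and prompt
# layout are simplified assumptions for illustration only.

def retrieve(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Rank repository snippets by word overlap with the requirement."""
    q = set(query.lower().split())
    scored = sorted(snippets, key=lambda s: -len(q & set(s.lower().split())))
    return scored[:k]

def build_prompt(requirement: str, snippets: list[str]) -> str:
    """Prepend retrieved repository context to ground the LLM's output."""
    context = "\n".join(f"# context: {s}" for s in retrieve(requirement, snippets))
    return f"{context}\n# task: {requirement}\n"

# Toy "repository" of existing code summaries (hypothetical examples).
repo = [
    "def load_config(path): ...  # reads JSON config",
    "class UserStore: ...  # persists users to the database",
    "def send_email(to, body): ...",
]

prompt = build_prompt("load the JSON config file", repo)
```

The intuition behind such mitigation is that hallucinations often stem from the model guessing at project-specific identifiers; supplying retrieved, actually existing code reduces that guesswork.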
Wed 25 Jun. Time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
11:00 - 12:30 | Program Repair (Tool Demonstrations / Research Papers) at Aurora A. Chair(s): Yannic Noller (Ruhr University Bochum)

11:00 (25m, Talk) | LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation. Research Papers. Ziyao Zhang (Sun Yat-sen University), Chong Wang (Nanyang Technological University), Yanlin Wang (Sun Yat-sen University), Ensheng Shi (Xi'an Jiaotong University), Yuchi Ma (Huawei Cloud Computing Technologies), Wanjun Zhong (Sun Yat-sen University), Jiachi Chen (Sun Yat-sen University), Mingzhi Mao (Sun Yat-sen University), Zibin Zheng (Sun Yat-sen University)

11:25 (25m, Talk) | AdverIntent-Agent: Adversarial Reasoning for Repair Based on Inferred Program Intent. Research Papers. He Ye (University College London (UCL)), Aidan Z.H. Yang (Carnegie Mellon University), Chang Hu (Macau University of Science and Technology), Yanlin Wang (Sun Yat-sen University), Tao Zhang (Macau University of Science and Technology), Claire Le Goues (Carnegie Mellon University)

11:50 (25m, Talk) | PatchScope: LLM-Enhanced Fine-Grained Stable Patch Classification for Linux Kernel. Research Papers. Rongkai Liu (Central South University), Heyuan Shi (Central South University), Shuning Liu (Central South University), Chao Hu (Central South University), Sisheng Li (Central South University), Yuheng Shen (Tsinghua University), Runzhe Wang (Alibaba Group), Xiaohai Shi (Alibaba Group), Yu Jiang (Tsinghua University)

12:15 (15m, Demonstration) | InfraFix: Technology-Agnostic Repair of Infrastructure as Code. Tool Demonstrations. Nuno Saavedra (INESC-ID and IST, University of Lisbon), João F. Ferreira (INESC-ID and IST, University of Lisbon), Alexandra Mendes (Faculty of Engineering, University of Porto & INESC TEC)
Aurora A is the first room in the Aurora wing.
When facing the main Cosmos Hall, the access to the Aurora wing is on the right, close to the side entrance of the hotel.