A Knowledge Enhanced Large Language Model for Bug Localization (FSE 2025 - Research Papers)

Who

Yue Li, Bohan Liu, Ting Zhang, Zhiqi Wang, David Lo, Lanxin Yang, Jun Lyu, He Zhang

Track

FSE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

When

Mon 23 Jun 2025 14:10 - 14:30 at Cosmos Hall - LLM for SE 1 Chair(s): Chao Peng

Abstract

A significant number of bug reports are generated every day as software systems continue to develop. Large Language Models (LLMs) have been used to correlate bug reports with source code to locate bugs automatically. The existing research has shown that LLMs are effective for bug localization and can increase software development efficiency. However, these studies still have two weaknesses. First, these models fail to capture context information about bug reports and source code. Second, these models are unable to understand the domain-specific expertise inherent to particular projects, such as version information in projects that are composed of alphanumeric characters without any semantic meaning.

To address these challenges, we propose a Knowledge Enhanced Pre-Trained model using project documents and historical code, called KEPT, for bug localization. Project documents record, revise, and restate project information that provides rich semantic information about those projects. Historical code contains rich code semantic information that can enhance the reasoning ability of LLMs. Specifically, we construct knowledge graphs from project documents and source code. Then, we introduce knowledge graphs to the LLM through soft-position embedding and visible matrices, enhancing its contextual and professional reasoning ability. To validate our model, we conducted a series of experiments on seven open-source software projects with over 6,000 bug reports. Compared with the traditional model (i.e., Locus), KEPT performs better by 33.2% to 59.5% in terms of mean reciprocal rank, mean average precision, and Top@N. Compared with the best-performing LLM (i.e., CodeT5), KEPT achieves an improvement of 36.6% to 63.7%. The results indicate that introducing knowledge graphs can enhance the effectiveness of the LLM in bug localization.
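The three metrics named in the abstract have commonly used definitions in bug localization work. The Python sketch below illustrates them, assuming each bug report yields a ranked list of candidate file paths and a set of files that were actually fixed. The function names and sample data are hypothetical and not taken from the paper, and the exact metric variants used in the paper may differ (for example, in how average precision is normalized).

```python
from typing import Dict, List, Set, Tuple


def reciprocal_rank(ranked: List[str], buggy: Set[str]) -> float:
    """1 / rank of the first buggy file in the ranking, or 0.0 if none appears."""
    for rank, path in enumerate(ranked, start=1):
        if path in buggy:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: List[str], buggy: Set[str]) -> float:
    """Mean of precision@k over the ranks k where a buggy file appears.

    Assumption: normalized by the total number of buggy files.
    """
    if not buggy:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked, start=1):
        if path in buggy:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy)


def top_n_hit(ranked: List[str], buggy: Set[str], n: int) -> bool:
    """True if at least one buggy file is among the first n ranked files."""
    return any(path in buggy for path in ranked[:n])


def evaluate(results: List[Tuple[List[str], Set[str]]], n: int) -> Dict[str, float]:
    """Average the per-report scores over all bug reports."""
    count = len(results)
    return {
        "MRR": sum(reciprocal_rank(r, b) for r, b in results) / count,
        "MAP": sum(average_precision(r, b) for r, b in results) / count,
        f"Top@{n}": sum(top_n_hit(r, b, n) for r, b in results) / count,
    }


# Hypothetical sample: two bug reports with ranked candidate files.
sample = [
    (["a.java", "b.java", "c.java"], {"b.java"}),
    (["x.java", "y.java", "z.java"], {"x.java", "z.java"}),
]
metrics = evaluate(sample, n=1)
print(metrics)
```

In this sample, the first report places its only buggy file at rank 2 and the second places one at rank 1, so MRR is the mean of 0.5 and 1.0, and only the second report counts as a Top@1 hit.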

DOI

https://doi.org/10.1145/3729356

Yue Li

Nanjing University

China

Bohan Liu

Nanjing University

Ting Zhang

Singapore Management University

Singapore

Zhiqi Wang

Nanjing University

David Lo

Singapore Management University

Singapore

Lanxin Yang

Nanjing University

China

Jun Lyu

Nanjing University

China

He Zhang

Nanjing University

China

Session Program

Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:20	LLM for SE 1 (Ideas, Visions and Reflections / Research Papers / Industry Papers / Demonstrations) at Cosmos Hall. Chair(s): Chao Peng (ByteDance)

14:00 10m Talk		Teamwork makes the dream work: LLMs-Based Agents for GitHub README.MD Summarization (Ideas, Visions and Reflections). Duc S. H. Nguyen (Hanoi University of Science and Technology), Bach G. Truong (Hanoi University of Science and Technology), Phuong T. Nguyen (University of L'Aquila), Juri Di Rocco (University of L'Aquila), Davide Di Ruscio (University of L'Aquila)
14:10 20m Talk		A Knowledge Enhanced Large Language Model for Bug Localization (Research Papers). Yue Li (Nanjing University), Bohan Liu (Nanjing University), Ting Zhang (Singapore Management University), Zhiqi Wang (Nanjing University), David Lo (Singapore Management University), Lanxin Yang (Nanjing University), Jun Lyu (Nanjing University), He Zhang (Nanjing University)
14:30 10m Talk		A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models (Demonstrations). Changshu Liu (University of Illinois at Urbana-Champaign), Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign)
14:40 20m Talk		TickIt: Leveraging Large Language Models for Automated Ticket Escalation (Industry Papers). Fengrui Liu (ByteDance), Xiao He (ByteDance), Tieying Zhang (ByteDance), Jianjun Chen (ByteDance), Yi Li (Nanyang Technological University), Lihua Yi (ByteDance), Haipeng Zhang (ByteDance), Gang Wu (ByteDance), Rui Shi (ByteDance)
15:00 20m Talk		Natural Language Outlines for Code: Literate Programming in the LLM Era (Industry Papers). Kensen Shi (Google DeepMind), Deniz Altinbuken (Google), Saswat Anand (Google), Mihai Christodorescu (Google), Katja Grünwedel (Google), Alexa Koenings (Google), Sai Naidu (Google), Anurag Pathak (Google), Marc Rasi (Google), Fredde Ribeiro (Google), Brandon Ruffin (Google), Siddhant Sanyam (Google), Maxim Tabachnyk (Google), Sara Toth (Google), Roy Tu (Google), Tobias Welp (Google), Pengcheng Yin (Google), Manzil Zaheer (Google), Satish Chandra (Google, Inc), Charles Sutton (Google Research)

Information for Participants

Mon 23 Jun 2025 14:00 - 15:20 at Cosmos Hall - LLM for SE 1 Chair(s): Chao Peng

Info for room Cosmos Hall:

This is the main event hall of Clarion Hotel, which will be used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also happen in this room.

The room is just in front of the registration desk, on the other side of the main conference area. The large doors with numbers “1” and “2” provide access to the Cosmos Hall.