Code Change Intention, Development Artifact and History Vulnerability: Putting Them Together for Vulnerability Fix Detection by LLM (FSE 2025 - Research Papers)

Who

Xu Yang, Wenhan Zhu, Michael Pacheco, Jiayuan Zhou, Shaowei Wang, Xing Hu, Kui Liu

Track

FSE 2025 Research Papers

When

Mon 23 Jun 2025 11:30 - 11:50 at Cosmos 3C - Vulnerability 1 Chair(s): Cuiyun Gao

Abstract

Detecting vulnerability fix commits in open-source software (OSS) is crucial for maintaining software security. To help identify vulnerability fix commits in OSS, several automated approaches have been developed. However, existing approaches like VulFixMiner and CoLeFunDa focus solely on code changes, neglecting essential context from development artifacts. Tools like Vulcurator, which integrates issue reports, fail to leverage semantic associations between different development artifacts (e.g., pull requests and historical vulnerability fixes). Moreover, they miss vulnerability fixes in tangled commits and lack explanations, limiting practical use. To address these limitations, we propose LLM4VFD, a novel framework that leverages Large Language Models (LLMs) enhanced with Chain-of-Thought reasoning and In-Context Learning to improve the accuracy of vulnerability fix detection. LLM4VFD comprises three components: (1) Code Change Intention, which analyzes commit summaries, purposes, and implications using Chain-of-Thought reasoning; (2) Development Artifact, which incorporates context from related issue reports and pull requests; (3) Historical Vulnerability, which retrieves similar past vulnerability fixes to enrich context. More importantly, on top of the prediction, LLM4VFD also provides a detailed analysis and explanation to help security experts understand the rationale behind the decision. We evaluated LLM4VFD against state-of-the-art techniques, including Pre-trained Language Model-based approaches and vanilla LLMs, using a newly collected dataset, BigVulFixes. Experimental results demonstrate that LLM4VFD significantly outperforms the best-performing existing approach by 68.1%–145.4%. Furthermore, we conducted a user study with security experts, showing that the analysis generated by LLM4VFD improves the efficiency of vulnerability fix identification.
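
The three components named in the abstract can be pictured as sections of a single prompt. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Commit` type, the function names, the prompt wording, and the token-overlap retrieval are all hypothetical stand-ins, since the abstract does not specify how retrieval or prompting is done. No LLM is called; the sketch only assembles the text a model might receive.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Hypothetical container for one commit and its linked artifacts."""
    message: str
    diff: str
    artifacts: list[str] = field(default_factory=list)  # issue / pull request text


def jaccard(a: str, b: str) -> float:
    """Token-set overlap; an assumed stand-in for the paper's retrieval method."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve_similar(query: str, history: list[str], k: int = 1) -> list[str]:
    """Return the k historical fix descriptions most similar to the query."""
    return sorted(history, key=lambda h: jaccard(query, h), reverse=True)[:k]


def build_prompt(commit: Commit, history: list[str], k: int = 1) -> str:
    """Assemble one prompt with a section per component (wording is illustrative)."""
    similar = retrieve_similar(commit.message, history, k)
    parts = [
        "## Code change",
        commit.message,
        commit.diff,
        "## Code Change Intention",
        "Reason step by step: summarize the change, state its purpose, "
        "then state its implications.",
        "## Development Artifact",
        "\n".join(commit.artifacts) or "(none linked)",
        "## Historical Vulnerability",
        "\n".join(similar) or "(none found)",
        "## Task",
        "Decide whether this commit fixes a vulnerability and explain why.",
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    commit = Commit(
        message="fix overflow in image parser",
        diff="- buf[len] = 0;\n+ if (len < cap) buf[len] = 0;",
        artifacts=["Issue: crash when opening a malformed image"],
    )
    history = ["fix buffer overflow in parser", "update readme typo"]
    print(build_prompt(commit, history))
```

Keeping each component in its own labeled section makes it easy to drop one and compare results, which is one plausible way to study how much each source of context contributes.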

DOI

https://doi.org/10.1145/3715738

Xu Yang

University of Manitoba

Wenhan Zhu

Huawei Canada

Michael Pacheco

Centre for Software Excellence, Huawei

Jiayuan Zhou

Huawei

Canada

Shaowei Wang

University of Manitoba

Canada

Xing Hu

Zhejiang University

China

Kui Liu

Huawei

China

Session Program

Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna (GMT+02:00)

10:30 - 12:30	Vulnerability 1 (Research Papers / Ideas, Visions and Reflections / Journal First) at Cosmos 3C. Chair(s): Cuiyun Gao (Harbin Institute of Technology, Shenzhen)

10:30 20m Talk		VulPA: Detecting Semantically Recurring Vulnerabilities with Multi-Object Typestate Analysis [Research Papers] Liqing Cao (Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Haofeng Li (SKLP, Institute of Computing Technology, CAS), Chenghang Shi (SKLP, Institute of Computing Technology, CAS), Jie Lu (SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China), Haining Meng (SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China), Lian Li (Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Jingling Xue (University of New South Wales)
10:50 20m Talk		Mystique: Automated Vulnerability Patch Porting with Semantic and Syntactic-Enhanced LLM [Research Papers] Susheng Wu (Fudan University), Ruisi Wang (Fudan University), Bihuan Chen (Fudan University), Zhuotong Zhou (Fudan University), Yiheng Huang (Fudan University), JunPeng Zhao (Fudan University), Xin Peng (Fudan University)
11:10 20m Talk		Identifying Affected Third-Party Java Libraries from Textual Descriptions of Vulnerabilities and Libraries [Journal First] Tianyu Chen (Microsoft Research Asia), Lin Li (Huawei Cloud Computing Technologies Co., Ltd.), Bingjie Shan (Huawei Cloud Computing Technologies Co., Ltd.), Guangtai Liang (Huawei Cloud Computing Technologies), Ding Li (Peking University), Qianxiang Wang (Huawei Technologies Co., Ltd), Tao Xie (Peking University)
11:30 20m Talk		Code Change Intention, Development Artifact and History Vulnerability: Putting Them Together for Vulnerability Fix Detection by LLM [Research Papers] Xu Yang (University of Manitoba), Wenhan Zhu (Huawei Canada), Michael Pacheco (Centre for Software Excellence, Huawei), Jiayuan Zhou (Huawei), Shaowei Wang (University of Manitoba), Xing Hu (Zhejiang University), Kui Liu (Huawei)
11:50 10m Talk		Augmenting Software Bills of Materials with Software Vulnerability Description [Ideas, Visions and Reflections] Davide Fucci (Blekinge Institute of Technology), Massimiliano Di Penta (University of Sannio, Italy), Simone Romano (University of Salerno), Giuseppe Scanniello (University of Salerno)
12:00 20m Talk		Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes [Research Papers] Amiao Gao (Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122), Zenong Zhang (The University of Texas - Dallas), Simin Wang (Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122), LiGuo Huang (Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205), Shiyi Wei (University of Texas at Dallas), Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688)
12:20 10m Talk		Emerging Results in Using Explainable AI to Improve Software Vulnerability Prediction [Ideas, Visions and Reflections] Fahad Al Debeyan (Lancaster University), Tracy Hall (Lancaster University), Lech Madeyski (Wroclaw University of Science and Technology)

Information for Participants

Mon 23 Jun 2025 10:30 - 12:30 at Cosmos 3C - Vulnerability 1 Chair(s): Cuiyun Gao

Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.