Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes (FSE 2025 - Research Papers)

Who

Amiao Gao, Zenong Zhang, Simin Wang, LiGuo Huang, Shiyi Wei, Vincent Ng

Track

FSE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 23 Jun 2025 12:00 - 12:20 at Cosmos 3C - Vulnerability 1 Chair(s): Cuiyun Gao

Abstract

Understanding software vulnerabilities and their resolutions is crucial for securing modern software systems. This study presents a novel traceability model that links a pair of sentences from natural language (NL) vulnerability artifacts, which describe at least one of three types of semantics for a vulnerability (triggers, crash phenomenon, and fix action), to their corresponding pair of code statements. Unlike traditional traceability models, our trace links between a pair of related NL sentences and a pair of code statements recover the semantic relationship between the code statements, so that the specific role each code statement plays in a vulnerability can be identified automatically. Our end-to-end approach is implemented in two key steps: VulnExtract and VulnTrace. VulnExtract automatically extracts sentences describing the triggers, crash phenomenon, and/or fix action of a vulnerability using 37 discourse patterns derived from NL artifacts (CVE summaries, bug reports, and commit messages). VulnTrace employs pre-trained code search models to trace these sentences to the corresponding code statements. Our empirical study, based on 341 CVEs and their associated code snippets, demonstrates the effectiveness of our approach, with recall exceeding 90% in most cases for NL sentence extraction. VulnTrace achieves a Top-5 accuracy of over 68.2% for mapping a pair of related NL sentences to the corresponding pair of code statements. The end-to-end combination of VulnExtract and VulnTrace achieves Top-5 accuracies of 59.6% and 53.1% for mapping two pairs of NL sentences to code statements. These results highlight the potential of our method for automating vulnerability comprehension and reducing manual effort.
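The two-step pipeline and the Top-5 metric described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the three regular expressions are hypothetical stand-ins for the 37 discourse patterns (which the abstract does not list), token overlap replaces the pre-trained code search models, and the sketch traces a single sentence to a single statement instead of a pair to a pair.

```python
import re

# Hypothetical discourse patterns, one per semantic type named in the abstract.
# The paper's actual 37 patterns are not given here; these are toy stand-ins.
PATTERNS = {
    "trigger": re.compile(r"\b(via|when|by sending|crafted)\b", re.I),
    "crash": re.compile(r"\b(crash|overflow|denial of service|out-of-bounds)\b", re.I),
    "fix": re.compile(r"\b(fix(es|ed)?|add(s|ed)?|check(s|ed)?|validate[sd]?)\b", re.I),
}


def extract(sentences):
    """Keep sentences matching at least one pattern, with their matched types."""
    labeled = []
    for sentence in sentences:
        types = [name for name, pattern in PATTERNS.items() if pattern.search(sentence)]
        if types:
            labeled.append((sentence, types))
    return labeled


def tokens(text):
    """Lowercase alphabetic tokens of a sentence or code statement."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank(sentence, statements):
    """Rank code statements by token overlap with the sentence.

    Token overlap is a deliberately simple substitute for a pre-trained
    code search model; ties keep their original order.
    """
    query = tokens(sentence)
    return sorted(statements, key=lambda stmt: len(query & tokens(stmt)), reverse=True)


def top_k_accuracy(ranked_lists, gold, k=5):
    """Fraction of queries whose gold statement appears in the first k results."""
    hits = sum(1 for ranked, target in zip(ranked_lists, gold) if target in ranked[:k])
    return hits / len(gold)
```

For example, `extract` applied to a CVE summary split into sentences would keep only those that match a trigger, crash, or fix pattern, and `top_k_accuracy(..., k=5)` would then score how often the correct statement appears among the five highest-ranked candidates for each kept sentence.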

DOI

https://doi.org/10.1145/3729360

Amiao Gao

Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122

Zenong Zhang

The University of Texas - Dallas

Simin Wang

Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122

LiGuo Huang

Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205

United States

Shiyi Wei

University of Texas at Dallas

United States

Vincent Ng

Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688

Session Program

Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:30 - 12:30	Vulnerability 1 (Research Papers / Ideas, Visions and Reflections / Journal First) at Cosmos 3C Chair(s): Cuiyun Gao (Harbin Institute of Technology, Shenzhen)

10:30 20m Talk		VulPA: Detecting Semantically Recurring Vulnerabilities with Multi-Object Typestate Analysis (Research Papers): Liqing Cao (Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Haofeng Li (SKLP, Institute of Computing Technology, CAS), Chenghang Shi (SKLP, Institute of Computing Technology, CAS), Jie Lu (SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China), Haining Meng (SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China), Lian Li (Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Jingling Xue (University of New South Wales)
10:50 20m Talk		Mystique: Automated Vulnerability Patch Porting with Semantic and Syntactic-Enhanced LLM (Research Papers): Susheng Wu (Fudan University), Ruisi Wang (Fudan University), Bihuan Chen (Fudan University), Zhuotong Zhou (Fudan University), Yiheng Huang (Fudan University), JunPeng Zhao (Fudan University), Xin Peng (Fudan University)
11:10 20m Talk		Identifying Affected Third-Party Java Libraries from Textual Descriptions of Vulnerabilities and Libraries (Journal First): Tianyu Chen (Microsoft Research Asia), Lin Li (Huawei Cloud Computing Technologies Co., Ltd.), Bingjie Shan (Huawei Cloud Computing Technologies Co., Ltd.), Guangtai Liang (Huawei Cloud Computing Technologies), Ding Li (Peking University), Qianxiang Wang (Huawei Technologies Co., Ltd), Tao Xie (Peking University)
11:30 20m Talk		Code Change Intention, Development Artifact and History Vulnerability: Putting Them Together for Vulnerability Fix Detection by LLM (Research Papers): Xu Yang (University of Manitoba), Wenhan Zhu (Huawei Canada), Michael Pacheco (Centre for Software Excellence, Huawei), Jiayuan Zhou (Huawei), Shaowei Wang (University of Manitoba), Xing Hu (Zhejiang University), Kui Liu (Huawei)
11:50 10m Talk		Augmenting Software Bills of Materials with Software Vulnerability Description (Ideas, Visions and Reflections): Davide Fucci (Blekinge Institute of Technology), Massimiliano Di Penta (University of Sannio, Italy), Simone Romano (University of Salerno), Giuseppe Scanniello (University of Salerno)
12:00 20m Talk		Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes (Research Papers): Amiao Gao (Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122), Zenong Zhang (The University of Texas - Dallas), Simin Wang (Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122), LiGuo Huang (Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205), Shiyi Wei (University of Texas at Dallas), Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688)
12:20 10m Talk		Emerging Results in Using Explainable AI to Improve Software Vulnerability Prediction (Ideas, Visions and Reflections): Fahad Al Debeyan (Lancaster University), Tracy Hall (Lancaster University), Lech Madeyski (Wroclaw University of Science and Technology)

Information for Participants

Mon 23 Jun 2025 10:30 - 12:30 at Cosmos 3C - Vulnerability 1 Chair(s): Cuiyun Gao

Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.