Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes
Understanding software vulnerabilities and their resolutions is crucial for securing modern software systems. This study presents a novel traceability model that links a pair of sentences describing at least one of the three types of semantics (triggers, crash phenomenon and fix action) for a vulnerability in natural language (NL) vulnerability artifacts, to their corresponding pair of code statements. Different from the traditional traceability models, our trace links between a pair of related NL sentences and a pair of code statements can recover the semantic relationship between code statements so that the specific role played by each code statement in a vulnerability can be automatically identified. Our end-to-end approach is implemented in two key steps: VulnExtract and VulnTrace. VulnExtract automatically extracts sentences describing triggers, crash phenomenon and/or fix action for a vulnerability using 37 discourse patterns derived from NL artifacts (CVE summary, bug reports and commit messages). VulnTrace employs pre-trained code search models to trace these sentences to corresponding code statements. Our empirical study, based on 341 CVEs and their associated code snippets, demonstrates the effectiveness of our approach, with recall exceeding 90% in most cases for NL sentence extraction. VulnTrace achieves a Top5 accuracy of over 68.2% for mapping a pair of related NL sentences to corresponding pair of code statements. The end-to-end combined VulnExtract+VulnTrace achieves a Top5 accuracy of 59.6% and 53.1% for mapping two pairs of NL sentences to code statements. These results highlight the potential of our method in automating vulnerability comprehension and reducing manual effort.
Mon 23 JunDisplayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna change
10:30 - 12:30 | Vulnerability 1Research Papers / Ideas, Visions and Reflections / Journal First at Cosmos 3C Chair(s): Cuiyun Gao Harbin Institute of Technology, Shenzhen | ||
10:30 20mTalk | VulPA: Detecting Semantically Recurring Vulnerabilities with Multi-Object Typestate Analysis Research Papers Liqing Cao Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Haofeng Li SKLP, Institute of Computing Technology, CAS, Chenghang Shi SKLP, Institute of Computing Technology, CAS, Jie Lu SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China, Haining Meng SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China, Lian Li Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jingling Xue University of New South Wales DOI | ||
10:50 20mTalk | Mystique: Automated Vulnerability Patch Porting with Semantic and Syntactic-Enhanced LLM Research Papers Susheng Wu Fudan University, Ruisi Wang Fudan University, Bihuan Chen Fudan University, Zhuotong Zhou Fudan University, Yiheng Huang Fudan University, JunPeng Zhao Fudan University, Xin Peng Fudan University DOI | ||
11:10 20mTalk | Identifying Affected Third-Party Java Libraries from Textual Descriptions of Vulnerabilities and Libraries Journal First Tianyu Chen Microsoft Research Asia, Lin Li Huawei Cloud Computing Technologies Co., Ltd., Bingjie Shan Huawei Cloud Computing Technologies Co., Ltd., Guangtai Liang Huawei Cloud Computing Technologies, Ding Li Peking University, Qianxiang Wang Huawei Technologies Co., Ltd, Tao Xie Peking University | ||
11:30 20mTalk | Code Change Intention, Development Artifact and History Vulnerability: Putting Them Together for Vulnerability Fix Detection by LLM Research Papers Xu Yang University of Manitoba, Wenhan Zhu Huawei Canada, Michael Pacheco Centre for Software Excellence, Huawei, Jiayuan Zhou Huawei, Shaowei Wang University of Manitoba, Xing Hu Zhejiang University, Kui Liu Huawei DOI | ||
11:50 10mTalk | Augmenting Software Bills of Materials with Software Vulnerability Description Ideas, Visions and Reflections Davide Fucci Blekinge Institute of Technology, Massimiliano Di Penta University of Sannio, Italy, Simone Romano University of Salerno, Giuseppe Scanniello University of Salerno | ||
12:00 20mTalk | Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes Research Papers Amiao Gao Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122, Zenong Zhang The University of Texas - Dallas, Simin Wang Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122, LiGuo Huang Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205, Shiyi Wei University of Texas at Dallas, Vincent Ng Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688 DOI | ||
12:20 10mTalk | Emerging Results in Using Explainable AI to Improve Software Vulnerability Prediction Ideas, Visions and Reflections Fahad Al Debeyan Lancaster University, Tracy Hall Lancaster University, Lech Madeyski Wroclaw University of Science and Technology |
Cosmos 3C is the third room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.