Emerging Results in Using Explainable AI to Improve Software Vulnerability Prediction
Explainable Artificial Intelligence (XAI) has recently been applied to vulnerability prediction models to understand the decisions they make and to improve their transparency. We are the first to leverage XAI explanations to improve vulnerability prediction performance. The performance of vulnerability prediction models relies on the quality of both the vulnerability dataset and the machine learning model. We use XAI information to identify biases in vulnerability prediction datasets and limitations in deep learning-based prediction models. Our XAI analysis uses a state-of-the-art deep-learning vulnerability prediction model (LineVul) and an explainability algorithm (Layered Integrated Gradients) to generate XAI information. This information improved our understanding of how our models worked and allowed us to identify important improvement opportunities. Consequently, we present some surprising findings: although LineVul accurately predicted vulnerable functions, XAI data revealed that in 43% of cases those predictions were based on dataset biases rather than on the actual vulnerable lines. By systematically removing these dataset biases, we achieved a notable performance improvement, increasing LineVul’s F-measure from 92% to 96%. The insight we gained from XAI also allowed us to identify a fundamental limitation in LineVul’s reliance on CodeBERT, a pre-trained language model limited to 512 tokens. By integrating LongCoder, a pre-trained model capable of processing longer sequences, we raised the F-measure from 92% and the MCC from 91% to 94% in both cases, highlighting the potential for improved handling of complex, long-sequence vulnerabilities. We conclude that XAI has important applications beyond providing users with information describing the basis of predictions.
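The abstract does not say how attributions were compared with vulnerable lines, so the following is only a minimal sketch of one way such a check could look. It assumes token-level attribution scores (for example, from Layered Integrated Gradients) have already been computed and mapped to source line numbers. The function names, the top-k criterion, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def line_scores(token_attributions):
    """Sum absolute token attributions per source line.

    token_attributions: iterable of (line_number, attribution) pairs.
    Assumption: attributions were produced elsewhere by an XAI method.
    """
    scores = defaultdict(float)
    for line_no, attribution in token_attributions:
        scores[line_no] += abs(attribution)
    return dict(scores)


def top_k_lines(scores, k):
    """Return the k highest-scoring lines (ties broken by line number)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [line_no for line_no, _ in ranked[:k]]


def prediction_is_grounded(token_attributions, vulnerable_lines, k=3):
    """True if any of the top-k attributed lines is a known vulnerable line."""
    top = top_k_lines(line_scores(token_attributions), k)
    return any(line_no in vulnerable_lines for line_no in top)


def ungrounded_fraction(samples, k=3):
    """Fraction of correctly predicted functions whose top-k lines miss
    every vulnerable line, i.e. predictions likely driven by something else.

    samples: list of (token_attributions, vulnerable_lines) pairs.
    """
    if not samples:
        return 0.0
    ungrounded = sum(
        1
        for attributions, vulnerable_lines in samples
        if not prediction_is_grounded(attributions, vulnerable_lines, k)
    )
    return ungrounded / len(samples)


# Hypothetical data: two functions that the model flagged as vulnerable.
sample_a = ([(1, 0.1), (2, 0.7), (2, 0.2), (3, -0.05)], {2})  # focus on line 2
sample_b = ([(1, 0.8), (2, 0.1), (5, 0.05)], {5})             # focus on line 1
fraction = ungrounded_fraction([sample_a, sample_b], k=1)
```

A high ungrounded fraction under a check like this would suggest that correct predictions rest on features other than the vulnerable lines themselves, which is the kind of signal the abstract attributes to dataset bias.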
Mon 23 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:30 - 12:30 | Vulnerability 1 | Research Papers / Ideas, Visions and Reflections / Journal First | at Cosmos 3C | Chair(s): Cuiyun Gao Harbin Institute of Technology, Shenzhen
10:30 | 20m | Talk | VulPA: Detecting Semantically Recurring Vulnerabilities with Multi-Object Typestate Analysis | Research Papers | Liqing Cao Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Haofeng Li SKLP, Institute of Computing Technology, CAS, Chenghang Shi SKLP, Institute of Computing Technology, CAS, Jie Lu SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China, Haining Meng SKLP, Institute of Computing Technology, CAS, China; University of Chinese Academy of Sciences, China, Lian Li Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jingling Xue University of New South Wales
10:50 | 20m | Talk | Mystique: Automated Vulnerability Patch Porting with Semantic and Syntactic-Enhanced LLM | Research Papers | Susheng Wu Fudan University, Ruisi Wang Fudan University, Bihuan Chen Fudan University, Zhuotong Zhou Fudan University, Yiheng Huang Fudan University, JunPeng Zhao Fudan University, Xin Peng Fudan University
11:10 | 20m | Talk | Identifying Affected Third-Party Java Libraries from Textual Descriptions of Vulnerabilities and Libraries | Journal First | Tianyu Chen Microsoft Research Asia, Lin Li Huawei Cloud Computing Technologies Co., Ltd., Bingjie Shan Huawei Cloud Computing Technologies Co., Ltd., Guangtai Liang Huawei Cloud Computing Technologies, Ding Li Peking University, Qianxiang Wang Huawei Technologies Co., Ltd, Tao Xie Peking University
11:30 | 20m | Talk | Code Change Intention, Development Artifact and History Vulnerability: Putting Them Together for Vulnerability Fix Detection by LLM | Research Papers | Xu Yang University of Manitoba, Wenhan Zhu Huawei Canada, Michael Pacheco Centre for Software Excellence, Huawei, Jiayuan Zhou Huawei, Shaowei Wang University of Manitoba, Xing Hu Zhejiang University, Kui Liu Huawei
11:50 | 10m | Talk | Augmenting Software Bills of Materials with Software Vulnerability Description | Ideas, Visions and Reflections | Davide Fucci Blekinge Institute of Technology, Massimiliano Di Penta University of Sannio, Italy, Simone Romano University of Salerno, Giuseppe Scanniello University of Salerno
12:00 | 20m | Talk | Teaching AI the ‘Why’ and ‘How’ of Software Vulnerability Fixes | Research Papers | Amiao Gao Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122, Zenong Zhang The University of Texas - Dallas, Simin Wang Department of Computer Science, Southern Methodist University, Dallas, Texas, USA 75275-0122, LiGuo Huang Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205, Shiyi Wei University of Texas at Dallas, Vincent Ng Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688
12:20 | 10m | Talk | Emerging Results in Using Explainable AI to Improve Software Vulnerability Prediction | Ideas, Visions and Reflections | Fahad Al Debeyan Lancaster University, Tracy Hall Lancaster University, Lech Madeyski Wroclaw University of Science and Technology
Cosmos 3C is the third room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.