An Empirical Study of Deep Learning Models for Vulnerability Detection
Fri 19 May 2023 13:45 - 14:00 at Meeting Room 106 - Vulnerability detection Chair(s): Cuiyun Gao
Deep learning (DL) models of code have recently reported great progress for vulnerability detection. In some cases, DL-based models have outperformed static analysis tools. Although many great models have been proposed, we do not yet have a good understanding of these models. This limits further advances in model robustness, debugging, and deployment for vulnerability detection. In this paper, we surveyed and reproduced 9 state-of-the-art (SOTA) deep learning models on 2 widely used vulnerability detection datasets: Devign and MSR. We investigated 6 research questions in three areas, namely model capabilities, training data, and model interpretation. We experimentally demonstrated the variability between different runs of a model and the low agreement among different models’ outputs. We compared models trained for specific types of vulnerabilities with a model trained on all vulnerabilities at once. We explored the types of programs DL may consider “hard” to handle. We investigated how training data size and training data composition relate to model performance. Finally, we studied model interpretations and analyzed the important features that the models used to make predictions. We believe that our findings can help better understand model results, provide guidance on preparing training data, and improve the robustness of the models. All of our datasets, code, and results are available at https://figshare.com/s/284abfba67dba448fdc2.
Fri 19 May (displayed time zone: Hobart)
13:45 - 15:15 | Vulnerability detection | Technical Track / Journal-First Papers | Meeting Room 106 | Chair(s): Cuiyun Gao (Harbin Institute of Technology)
13:45 | 15m | Talk | An Empirical Study of Deep Learning Models for Vulnerability Detection | Technical Track | Benjamin Steenhoek (Iowa State University), Md Mahbubur Rahman (Iowa State University), Richard Jiles (Iowa State University), Wei Le (Iowa State University)
14:00 | 15m | Talk | DeepVD: Toward Class-Separation Features for Neural Network Vulnerability Detection | Technical Track | Wenbo Wang (New Jersey Institute of Technology), Tien N. Nguyen (University of Texas at Dallas), Shaohua Wang (New Jersey Institute of Technology), Yi Li (New Jersey Institute of Technology), Jiyuan Zhang (University of Illinois Urbana-Champaign), Aashish Yadavally (The University of Texas at Dallas)
14:15 | 15m | Talk | Enhancing Deep Learning-based Vulnerability Detection by Building Behavior Graph Model | Technical Track | Bin Yuan (Huazhong University of Science and Technology), Yifan Lu (Huazhong University of Science and Technology), Yilin Fang (Huazhong University of Science and Technology), Yueming Wu (Nanyang Technological University), Deqing Zou (Huazhong University of Science and Technology), Zhen Li (Huazhong University of Science and Technology), Zhi Li (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)
14:30 | 15m | Talk | Vulnerability Detection with Graph Simplification and Enhanced Graph Representation Learning | Technical Track | Xin-Cheng Wen (Harbin Institute of Technology), Yupan (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Hongyu Zhang (The University of Newcastle), Jie M. Zhang (King's College London), Qing Liao (Harbin Institute of Technology)
14:45 | 15m | Talk | Does data sampling improve deep learning-based vulnerability detection? Yeas! and Nays! | Technical Track | Xu Yang (University of Manitoba), Shaowei Wang (University of Manitoba), Yi Li (New Jersey Institute of Technology), Shaohua Wang (New Jersey Institute of Technology)
15:00 | 7m | Talk | Learning from What We Know: How to Perform Vulnerability Prediction using Noisy Historical Data | Journal-First Papers | Aayush Garg (University of Luxembourg, Luxembourg), Renzo Degiovanni (SnT, University of Luxembourg), Matthieu Jimenez (SnT, University of Luxembourg), Maxime Cordy (University of Luxembourg, Luxembourg), Mike Papadakis (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
15:07 | 7m | Talk | Do I really need all this work to find vulnerabilities? An empirical case study comparing vulnerability detection techniques on a Java application | Journal-First Papers | Sarah Elder (North Carolina State University), Nusrat Zahan (North Carolina State University), Rui Shu (North Carolina State University), Valeri Kozarev (North Carolina State University), Tim Menzies (North Carolina State University), Laurie Williams (North Carolina State University)