LineVD: Statement-level Vulnerability Detection using Graph Neural Networks (MSR 2022 - Technical Papers)

Who

David Hin, Andrey Kan, Huaming Chen, Muhammad Ali Babar

Track

MSR 2022 Technical Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 19 May 2022 21:07 - 21:14 at MSR Main room - odd hours - Session 13: Security & Quality Chair(s): Gias Uddin

Abstract

Current machine-learning-based software vulnerability detection methods are primarily conducted at the function-level. However, a key limitation of these methods is that they do not indicate the specific lines of code contributing to vulnerabilities. This limits the ability of developers to efficiently inspect and interpret the predictions from a learnt model, which is crucial for integrating machine-learning-based tools into the software development workflow. Graph-based models have shown promising performance in function-level vulnerability detection, but their capability for statement-level vulnerability detection has not been extensively explored. While interpreting function-level predictions through explainable AI is one promising direction, we herein consider the statement-level software vulnerability detection task from a fully supervised learning perspective. We propose a novel deep learning framework, LineVD, which formulates statement-level vulnerability detection as a node classification task. LineVD leverages control and data dependencies between statements using graph neural networks, and a transformer-based model to encode the raw source code tokens. In particular, by addressing the conflicting outputs between function-level and statement-level information, LineVD significantly improves the prediction performance without vulnerability status for function code. We have conducted extensive experiments against a large-scale collection of real-world C/C++ vulnerabilities obtained from multiple real-world projects, and demonstrate an increase of 105% in F1-score over the current state-of-the-art.
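
The node classification framing in the abstract can be illustrated with a minimal sketch. This is not the LineVD implementation: the class `StatementGraph`, the helper functions, the two-dimensional feature vectors (standing in for transformer embeddings of each statement), and the hand-picked classifier weights are all hypothetical, and no training takes place. The sketch only shows the shape of the task: one node per statement, edges for control and data dependencies, one round of neighbour aggregation, and one score per statement.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StatementGraph:
    """A function as a graph: one node per statement (hypothetical structure)."""
    features: List[List[float]]   # per-statement vector; stand-in for a transformer embedding
    edges: List[Tuple[int, int]]  # (src, dst) control/data dependency edges


def neighbours(graph: StatementGraph, node: int) -> List[int]:
    """Nodes linked to `node` by a dependency edge, in either direction."""
    linked = set()
    for src, dst in graph.edges:
        if src == node:
            linked.add(dst)
        if dst == node:
            linked.add(src)
    return sorted(linked)


def message_pass(graph: StatementGraph) -> List[List[float]]:
    """One round of mean aggregation over each node and its neighbours."""
    updated = []
    for i, own in enumerate(graph.features):
        group = [own] + [graph.features[j] for j in neighbours(graph, i)]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


def classify_statements(graph: StatementGraph,
                        weights: List[float],
                        bias: float) -> List[float]:
    """Score every statement node with one shared logistic classifier."""
    scores = []
    for vec in message_pass(graph):
        z = sum(w * x for w, x in zip(weights, vec)) + bias
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


# Toy function with four statements; statement 3 has no dependencies.
graph = StatementGraph(
    features=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0]],
    edges=[(0, 1), (1, 2)],
)
# Hand-picked weights, an assumption for illustration only.
scores = classify_statements(graph, weights=[1.0, -1.0], bias=0.0)
for line_no, score in enumerate(scores):
    print(f"statement {line_no}: vulnerability score {score:.2f}")
```

The output is one score per statement rather than one per function, which is what lets a developer see which lines a prediction points at.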

David Hin

The University of Adelaide

Andrey Kan

The University of Adelaide

Huaming Chen

The University of Adelaide

Muhammad Ali Babar

University of Adelaide

Australia

Session Program

Thu 19 May
Displayed time zone: Eastern Time (US & Canada)

21:00 - 21:50	Session 13: Security & Quality (Technical Papers / Data and Tool Showcase Track / Registered Reports / Industry Track) at MSR Main room - odd hours. Chair(s): Gias Uddin (University of Calgary, Canada)

21:00 7m Talk		On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models (Technical Papers). Triet Le (The University of Adelaide), Muhammad Ali Babar (University of Adelaide). Pre-print
21:07 7m Talk		LineVD: Statement-level Vulnerability Detection using Graph Neural Networks (Technical Papers). David Hin (The University of Adelaide), Andrey Kan (The University of Adelaide), Huaming Chen (The University of Adelaide), Muhammad Ali Babar (University of Adelaide)
21:14 7m Talk		LineVul: A Transformer-based Line-Level Vulnerability Prediction (Technical Papers). Michael Fu (Monash University), Kla Tantithamthavorn (Monash University). Pre-print
21:21 4m Talk		ECench: An Energy Bug Benchmark of Ethereum Client Software (Data and Tool Showcase Track). Jinyoung Kim (Sungkyunkwan University), Misoo Kim (Sungkyunkwan University), Eunseok Lee (Sungkyunkwan University)
21:25 7m Talk		Microsoft CloudMine: Data Mining for the Executive Order on Improving the Nation’s Cybersecurity (Industry Track). Kim Herzig (Tools for Software Engineers, Microsoft), Luke Gostling (Microsoft Corporation), Maximilian Grothusmann (Microsoft Corporation), Nora Huang (Microsoft Corporation), Sascha Just (Microsoft), Alan Klimowski (Microsoft Corporation), Yashasvini Ramkumar (Microsoft Corporation), Myles McLeroy (Microsoft Corporation), Kıvanç Muşlu (Microsoft), Hitesh Sajnani (Microsoft), Varsha Vadaga (Microsoft Corporation)
21:32 4m Talk		Evaluating few shot and Contrastive learning Methods for Code Clone Detection (Registered Reports). Mohamad Khajezade (University of British Columbia), Fatemeh Hendijani Fard (University of British Columbia), Mohamed S Shehata (University of British Columbia). Pre-print
21:36 14m Live Q&A		Discussions and Q&A (Technical Papers)
