Towards Causal Deep Learning for Vulnerability Detection (ICSE 2024 - Research Track)

Who

Md Mahbubur Rahman, Ira Ceka, Chengzhi Mao, Saikat Chakraborty, Baishakhi Ray, Wei Le

Track

ICSE 2024 Research Track

Time Zone

The program is currently displayed in (GMT+01:00) Lisbon.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 18 Apr 2024 11:00 - 11:15 at Sophia de Mello Breyner Andresen - AI & Security 2 Chair(s): Gabriele Bavota

Abstract

Deep learning vulnerability detection has shown promising results in recent years. However, an important challenge that still blocks it from being very useful in practice is that the model is not robust under perturbation and cannot generalize well to out-of-distribution (OOD) data, e.g., when a trained model is applied to unseen projects in the real world. We hypothesize that this is because the model learned non-robust features, e.g., variable names, that have spurious correlations with labels. When the perturbed and OOD datasets no longer have the same spurious features, the model prediction fails. To address the challenge, in this paper, we introduced causality into deep learning vulnerability detection. Our approach, CausalVul, consists of two phases. First, we designed novel perturbations to discover spurious features that the model may use to make predictions. Second, we applied causal learning algorithms, specifically do-calculus, on top of existing deep learning models to systematically remove the use of spurious features and thus promote causal-based prediction. Our results show that CausalVul consistently improved the model accuracy, robustness, and OOD performance for all the state-of-the-art models and datasets we experimented with. To the best of our knowledge, this is the first work that introduces do-calculus-based causal learning to software engineering models and shows it is indeed useful for improving model accuracy, robustness, and generalization. Our replication package is located at https://figshare.com/s/0ffda320dcb96c249ef2.
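
To make the first phase concrete, the sketch below shows what a semantics-preserving perturbation check might look like: rename the variables in a code snippet and see whether a detector's prediction changes. This is a minimal illustration, not the authors' implementation. The helper `rename_identifiers`, the toy `spurious_detector` (a stand-in for a model that has latched onto a variable name), and the C snippet are all hypothetical and do not come from the paper or its replication package.

```python
import re


def rename_identifiers(code: str, mapping: dict) -> str:
    """Hypothetical perturbation: rename whole-word identifiers.

    Renaming variables does not change what the code does, so a robust
    detector should give the same prediction before and after.
    """
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], code)


def spurious_detector(code: str) -> bool:
    """Toy stand-in for a learned model (an assumption for illustration).

    It flags code as vulnerable whenever an identifier named `buf` appears,
    i.e. it relies on a name rather than on the unsafe copy itself.
    """
    return "buf" in re.findall(r"[A-Za-z_]\w*", code)


# Illustrative C snippet with an unchecked strcpy into a fixed-size buffer.
SNIPPET = "void f(char *src) { char buf[8]; strcpy(buf, src); }"

perturbed = rename_identifiers(SNIPPET, {"buf": "v0", "src": "v1"})

before = spurious_detector(SNIPPET)
after = spurious_detector(perturbed)

# A flipped prediction on semantically identical code suggests the
# detector depends on a spurious feature (here, the variable name).
prediction_flipped = before != after
```

In this toy setting the prediction flips after renaming even though the unsafe `strcpy` is unchanged. That is the kind of signal the abstract describes using to discover spurious features before the causal-learning phase is applied.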

Md Mahbubur Rahman

Iowa State University

Bangladesh

Ira Ceka

Columbia University

United States

Chengzhi Mao

Columbia University

Saikat Chakraborty

Microsoft Research

United States

Baishakhi Ray

AWS AI Labs

United States

Wei Le

Iowa State University

United States

Session Program

Thu 18 Apr
Displayed time zone: Lisbon

11:00 - 12:30	AI & Security 2, Research Track / New Ideas and Emerging Results, at Sophia de Mello Breyner Andresen. Chair(s): Gabriele Bavota (Software Institute @ Università della Svizzera Italiana)

11:00 15m Talk		Towards Causal Deep Learning for Vulnerability Detection Research Track Md Mahbubur Rahman Iowa State University, Ira Ceka Columbia University, Chengzhi Mao Columbia University, Saikat Chakraborty Microsoft Research, Baishakhi Ray AWS AI Labs, Wei Le Iowa State University
11:15 15m Talk		MetaLog: Generalizable Cross-System Anomaly Detection from Logs with Meta-Learning Research Track Chenyangguang Zhang Tsinghua University, Tong Jia Institute for Artificial Intelligence, Peking University, Beijing, China, Guopeng Shen Linkedsee Technology (China) Limited, Pinyan Zhu Linkedsee Technology (China) Limited, Ying Li School of Software and Microelectronics, Peking University, Beijing, China
11:30 15m Talk		Coca: Improving and Explaining Graph Neural Network-Based Vulnerability Detection Systems Research Track Sicong Cao Yangzhou University, Xiaobing Sun Yangzhou University, Xiaoxue Wu Yangzhou University, David Lo Singapore Management University, Lili Bo Yangzhou University, Bin Li Yangzhou University, Wei Liu Nanjing University
11:45 15m Talk		Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection Research Track Yizhou Chen Peking University, Zeyu Sun Institute of Software, Chinese Academy of Sciences, Zhihao Gong Peking University, Dan Hao Peking University
12:00 15m Talk		On the Effectiveness of Function-Level Vulnerability Detectors for Inter-Procedural Vulnerabilities Research Track Zhen Li Huazhong University of Science and Technology, Ning Wang Huazhong University of Science and Technology, Deqing Zou Huazhong University of Science and Technology, Yating Li Huazhong University of Science and Technology, Ruqian Zhang Huazhong University of Science and Technology, Shouhuai Xu University of Colorado Colorado Springs, Chao Zhang Tsinghua University, Hai Jin Huazhong University of Science and Technology
12:15 7m Talk		Large Language Model for Vulnerability Detection: Emerging Results and Future Directions New Ideas and Emerging Results Xin Zhou Singapore Management University, Singapore, Ting Zhang Singapore Management University, David Lo Singapore Management University
12:22 7m Talk		Re(gEx|DoS)Eval: Evaluating Generated Regular Expressions and their Proneness to DoS Attacks New Ideas and Emerging Results Mohammed Latif Siddiq University of Notre Dame, Jiahao Zhang, Lindsay Roney University of Notre Dame, Joanna C. S. Santos University of Notre Dame