CoLeFunDa: Explainable Silent Vulnerability Fix Identification (ICSE 2023 - Technical Track)

Who

Jiayuan Zhou, Michael Pacheco, Jinfu Chen, Xing Hu, Xin Xia, David Lo, Ahmed E. Hassan

Track

ICSE 2023 Technical Track

Time Zone

Times are shown in the conference time zone, (GMT+10:00) Hobart.

When

Fri 19 May 2023, 16:52 - 17:07, at Meeting Room 105. Session: Vulnerability testing and patching. Chair(s): Cristian Cadar

Abstract

OSS users commonly rely on security advisories to monitor newly disclosed OSS vulnerabilities and the patches that remediate them. However, vulnerability fixes are often publicly available about one week before the vulnerability is disclosed, and this time gap gives attackers an opportunity to develop exploits. Hence, it is important for OSS users to sense a fix as early as possible so that the vulnerability can be remediated before it is exploited. Because of vulnerability disclosure policies, vulnerabilities are normally fixed silently, which means the fix should not reveal any vulnerability information. As a result, even when a fix is identified, it is hard for OSS users to understand the vulnerability and evaluate its impact. Therefore, for better early sensing of vulnerabilities, identifying silent fixes and providing the corresponding explanations, e.g., the common weakness enumeration (CWE) category and the exploitability rating, are equally important.

However, identifying silent fixes and providing explanations is challenging because the available data is limited and diverse. To tackle this challenge, we propose CoLeFunDa, a framework consisting of a Contrastive Learner (CoLe) and FunDa, a novel approach for Function change Data augmentation. FunDa first increases the fix data (i.e., code changes) at the function level using unsupervised and supervised strategies. The contrastive learner then uses contrastive learning to train a function change encoder, FCBERT, effectively from the diverse fix data. Finally, we fine-tune FCBERT for three downstream tasks: automated silent fix identification, CWE category classification, and exploitability rating classification. Our results show that CoLeFunDa outperforms all state-of-the-art baselines on all three downstream tasks. We also conduct a survey to verify the effectiveness of CoLeFunDa in practical usage. The results show that CoLeFunDa categorizes 62.5% (25 out of 40) of the CVEs into the correct CWE category within its top 2 recommendations.
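To make the contrastive learning step more concrete, the following is a minimal sketch of a batch contrastive objective. The abstract does not state which loss the contrastive learner uses, so the InfoNCE-style loss here is an assumption, as are the function names `cosine` and `info_nce_loss`, the temperature value, and the toy 2-dimensional vectors, which merely stand in for encoder outputs. Each anchor is meant to represent one function change and its positive a related function change (for example, one paired with it by data augmentation); the other positives in the batch act as negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (assumed objective, not the paper's).

    anchors[i] should be most similar to positives[i]; every other
    positives[j] in the batch is treated as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy embeddings standing in for encoded function changes (hypothetical data).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive is close to the wrong anchor

loss_aligned = info_nce_loss(anchors, aligned)
loss_swapped = info_nce_loss(anchors, swapped)
print(loss_aligned < loss_swapped)
```

Minimizing a loss of this kind pulls the embeddings of paired function changes together and pushes unrelated ones apart, which is the general idea behind training an encoder from limited and diverse fix data.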

Jiayuan Zhou, Huawei, Canada

Michael Pacheco, Centre for Software Excellence, Huawei

Jinfu Chen, Centre for Software Excellence, Huawei, Canada

Xing Hu, Zhejiang University, China

Xin Xia, Huawei, China

David Lo, Singapore Management University, Singapore

Ahmed E. Hassan, Queen’s University