Finding A Needle in a Haystack: Automated Mining of Silent Vulnerability Fixes (ASE 2021 - Research Papers)

Who

Jiayuan Zhou, Michael Pacheco, Zhiyuan Wan, Xin Xia, David Lo, Yuan Wang, Ahmed E. Hassan

Track

ASE 2021 Research Papers

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

Use conference time zone: (GMT+11:00) HobartSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Thu 18 Nov 2021 11:00 - 11:20 at Kangaroo - Vulnerability Chair(s): Xusheng Xiao

Abstract

Following the coordinated vulnerability disclosure model, a vulnerability in open source software (OSS) is suggested to be fixed ``silently'', without disclosing the fix until the vulnerability is disclosed. Yet, it is crucial for OSS users to be aware of vulnerability fixes as early as possible, as once a vulnerability fix is pushed to the source code repository, a malicious party could probe for the corresponding vulnerability to exploit it. In practice, OSS users often rely on the vulnerability disclosure information from security advisories (e.g., National Vulnerability Database) to sense vulnerability fixes. However, the time between the availability of a vulnerability fix and its disclosure can vary from days to months, and in some cases, even years. Due to manpower constraints and the lack of expert knowledge, it is infeasible for OSS users to manually analyze all code changes for vulnerability fix detection. Therefore, it is essential to identify vulnerability fixes automatically and promptly. In a first-of-its-kind study, we propose VulFixMiner, a Transformer-based approach, capable of automatically extracting semantic meaning from commit-level code changes to identify silent vulnerability fixes. We construct our model using sampled commits from 204 projects, and evaluate using the full set of commits from 52 additional projects. The evaluation results show that VulFixMiner outperforms various state-of-the-art baselines in terms of AUC (i.e., 0.81 and 0.73 on Java and Python dataset, respectively) and two effort-aware performance metrics (i.e., EffortCost, P$_{opt}$). Especially, with an effort of inspecting 5% of total LOC, VulFixMiner can identify 49% of total vulnerability fixes. Additionally, with manual verification of sampled commits that were identified as vulnerability fixes, but not marked as such in our dataset, we observe that 35% (29 out of 82) of the commits are for fixing vulnerabilities, indicating VulFixMiner is also capable of identifying unreported vulnerability fixes.

Jiayuan Zhou

Centre for Software Excellence, Huawei, Canada

Michael Pacheco

Centre for Software Excellence, Huawei

Zhiyuan Wan

Zhejiang University

China

Xin Xia

Huawei Software Engineering Application Technology Lab

China

David Lo

Singapore Management University

Singapore

Yuan Wang

Huawei Sweden Research Center

Ahmed E. Hassan

Queen's University

Canada