Unsupervised Labeling and Extraction of Phrase-based Concepts in Vulnerability Descriptions (ASE 2021 - Research Papers)

Sun 14 - Sat 20 November 2021 Australia

Who

Sofonias Yitagesu, Zhenchang Xing, Xiaowang Zhang, Zhiyong Feng, Xiaohong Li, Linyi Han

Track

ASE 2021 Research Papers

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 18 Nov 2021 21:20 - 21:40 at Koala - Repositories Chair(s): Zeqi Lin

Abstract

People usually describe the key characteristics of software vulnerabilities in natural language mixed with domain-specific names and concepts. This textual nature poses a significant challenge for automatic analysis of vulnerabilities. Automatic extraction of key vulnerability aspects is highly desirable but demands significant effort to manually label data for model training. In this paper, we propose an unsupervised approach to label and extract important vulnerability concepts in textual vulnerability descriptions (TVDs). We focus on three types of phrase-based vulnerability concepts (root cause, attack vector and impact) as they are much more difficult to label and extract than name- or number-based entities (i.e., vendor, product and version). Our approach is based on a key observation that phrases of the same type, no matter how they differ in sentence structures and phrase expressions, usually share syntactically similar paths in the sentence parsing trees. Therefore, we propose two path representations (absolute paths and relative paths) and use an auto-encoder to encode such syntactic similarities. To address the discrete nature of our paths, we enhance the traditional Variational Auto-encoder (VAE) with the Gumbel-Max trick for categorical data distribution, and thus create a Categorical VAE (CaVAE). In the latent space of absolute and relative paths, we further apply FIt-TSNE and clustering techniques to generate clusters of same-type concepts. Our evaluation confirms the effectiveness of our CaVAE for encoding path representations, and the accuracy of vulnerability concepts in the resulting clusters. In a concept classification task, our unsupervisedly labeled vulnerability concepts outperform the two manually labeled datasets from previous work.
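
For readers unfamiliar with the Gumbel-Max trick named in the abstract, the following is a minimal, illustrative sketch of the general sampling technique only. It is not the authors' CaVAE, and the function names (`sample_gumbel`, `gumbel_max_sample`, `empirical_frequencies`) and the example probabilities are assumptions made for this illustration. The idea is that adding independent Gumbel(0, 1) noise to the log-probabilities of a categorical distribution and taking the argmax yields a sample from that distribution.

```python
import math
import random
from collections import Counter


def sample_gumbel(rng):
    """Draw one sample of standard Gumbel(0, 1) noise by inverse transform."""
    u = rng.random()
    # random() returns values in [0, 1); redraw on 0.0 because log(0) is undefined.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(log_probs, rng):
    """Return a category index drawn via the Gumbel-Max trick.

    Each log-probability is perturbed with independent Gumbel noise;
    the index of the largest perturbed value is the sample.
    """
    perturbed = [lp + sample_gumbel(rng) for lp in log_probs]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


def empirical_frequencies(probs, n_samples, seed=0):
    """Estimate category frequencies from repeated Gumbel-Max draws."""
    rng = random.Random(seed)
    log_probs = [math.log(p) for p in probs]
    counts = Counter(gumbel_max_sample(log_probs, rng) for _ in range(n_samples))
    return [counts[i] / n_samples for i in range(len(probs))]


if __name__ == "__main__":
    # Hypothetical three-category distribution, chosen only for illustration.
    print(empirical_frequencies([0.2, 0.5, 0.3], n_samples=20000))
```

With enough draws, the printed frequencies should approach the input probabilities, which is the property that makes the trick usable for sampling discrete latent variables.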

Sofonias Yitagesu

Tianjin University

Zhenchang Xing

Australian National University

Australia

Xiaowang Zhang

Tianjin University

China

Zhiyong Feng

Tianjin University

Xiaohong Li

Tianjin University

China

Linyi Han

Tianjin University

Session Program

Thu 18 Nov
Displayed time zone: Hobart

21:00 - 22:00	Repositories (Research Papers) at Koala Chair(s): Zeqi Lin Microsoft Research, China

21:00 20m Talk		Learning Domain-Specific Edit Operations from Model Repositories with Frequent Subgraph Mining Research Papers Christof Tinnes Saarland University, Timo Kehrer Humboldt University of Berlin, Mitchell Joblin Siemens AG, Uwe Hohenstein Siemens AG, Andreas Biesdorf Siemens AG, Sven Apel Saarland University
21:20 20m Talk		Unsupervised Labeling and Extraction of Phrase-based Concepts in Vulnerability Descriptions Research Papers Sofonias Yitagesu Tianjin University, Zhenchang Xing Australian National University, Xiaowang Zhang Tianjin University, Zhiyong Feng Tianjin University, Xiaohong Li Tianjin University, Linyi Han Tianjin University
21:40 20m Talk		A Compositional Deadlock Detector for Android Java Research Papers James Brotherston, Paul Brunet University College London, Nikos Gorogiannis Facebook, Max Kanovich University College London