ASE 2021
Sun 14 - Sat 20 November 2021 Australia
Thu 18 Nov 2021 21:20 - 21:40 at Koala - Repositories Chair(s): Zeqi Lin

People usually describe the key characteristics of software vulnerabilities in natural language mixed with domain-specific names and concepts. This textual nature poses a significant challenge for the automatic analysis of vulnerabilities. Automatic extraction of key vulnerability aspects is highly desirable but demands significant effort to manually label data for model training. In this paper, we propose an unsupervised approach to label and extract important vulnerability concepts in textual vulnerability descriptions (TVDs). We focus on three types of phrase-based vulnerability concepts (root cause, attack vector and impact), as they are much more difficult to label and extract than name- or number-based entities (i.e., vendor, product and version). Our approach is based on the key observation that phrases of the same type, no matter how they differ in sentence structure and phrase expression, usually share syntactically similar paths in the sentence parsing trees. Therefore, we propose two path representations (absolute paths and relative paths) and use an auto-encoder to encode such syntactic similarities. To address the discrete nature of our paths, we enhance the traditional Variational Auto-encoder (VAE) with the Gumbel-Max trick for categorical data distributions, and thus create a Categorical VAE (CaVAE). In the latent space of absolute and relative paths, we further apply FIt-TSNE and clustering techniques to generate clusters of same-type concepts. Our evaluation confirms the effectiveness of our CaVAE for encoding path representations, and the accuracy of the vulnerability concepts in the resulting clusters. In a concept classification task, the vulnerability concepts labeled by our unsupervised approach outperform the two manually labeled datasets from previous work.
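For readers unfamiliar with the Gumbel-Max trick mentioned in the abstract, the following is a minimal sketch of the trick on its own, not the authors' CaVAE implementation. It draws a category index from a categorical distribution by adding independent Gumbel(0, 1) noise to the log-probabilities and taking the argmax. The function names (`sample_gumbel`, `gumbel_max_sample`, `empirical_frequencies`) and the example probabilities are assumptions made for illustration only.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so that both logarithms are finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Return an index i with probability softmax(logits)[i]."""
    # Perturb each logit with independent Gumbel noise, then take the argmax.
    perturbed = [logit + sample_gumbel(rng) for logit in logits]
    return max(range(len(logits)), key=lambda i: perturbed[i])


def empirical_frequencies(probs, n_samples, seed=0):
    """Estimate category frequencies by repeated Gumbel-Max sampling."""
    rng = random.Random(seed)
    logits = [math.log(p) for p in probs]
    counts = [0] * len(probs)
    for _ in range(n_samples):
        counts[gumbel_max_sample(logits, rng)] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    # Hypothetical three-category distribution, chosen only for illustration.
    print(empirical_frequencies([0.1, 0.3, 0.6], 20000))
```

With enough samples, the empirical frequencies should approach the input probabilities. The argmax itself is not differentiable, so a model trained by gradient descent would typically need a continuous relaxation of this step; how the paper handles this is not stated in the abstract.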

Thu 18 Nov

Displayed time zone: Hobart

21:00 - 22:00
Repositories (Research Papers) at Koala
Chair(s): Zeqi Lin Microsoft Research, China
21:00
20m
Talk
Learning Domain-Specific Edit Operations from Model Repositories with Frequent Subgraph Mining
Research Papers
Christof Tinnes Saarland University, Timo Kehrer Humboldt University of Berlin, Mitchell Joblin Siemens AG, Uwe Hohenstein Siemens AG, Andreas Biesdorf Siemens AG, Sven Apel Saarland University
21:20
20m
Talk
Unsupervised Labeling and Extraction of Phrase-based Concepts in Vulnerability Descriptions
Research Papers
Sofonias Yitagesu Tianjin University, Zhenchang Xing Australian National University, Xiaowang Zhang Tianjin University, Zhiyong Feng Tianjin University, Xiaohong Li Tianjin University, Linyi Han Tianjin University
21:40
20m
Talk
A Compositional Deadlock Detector for Android Java
Research Papers
James Brotherston, Paul Brunet University College London, Nikos Gorogiannis Facebook, Max Kanovich University College London