A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing (ASE 2021 - Late Breaking Results) - ASE 2021

Sun 14 - Sat 20 November 2021 Australia

Who

Siqi Li, Yun Lin, Xiaofei Xie, Yuekang Li, Xiaohong Li, Weimin Ge, Yang Liu, Jin Song Dong

Track

ASE 2021 Late Breaking Results

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 18 Nov 2021 10:06 - 10:08 at Kangaroo - LBR + DS Poster (2) (Thursday 21:00 - 00:00) Chair(s): Xiaoyin Wang

Abstract

Fuzzing is a widely used technique for discovering software vulnerabilities. Many existing fuzzers leverage coverage feedback to evolve seeds so as to maximize program branch coverage. Recently, some techniques have proposed training deep learning models to predict the branch coverage of an arbitrary input. Those techniques have proved successful in improving coverage and discovering bugs under different experimental settings. However, deep learning models, usually treated as black boxes, notoriously lack explainability. Moreover, their performance can be sensitive to the runtime coverage information collected for training, indicating potentially unstable performance. To understand how reliably, and why, deep learning models can be used for fuzzing, in this work we conduct a systematic and extensive empirical study on 4 types of deep learning models across 6 projects to reproduce the actual performance of deep learning fuzzers, analyze the advantages and disadvantages of deep learning in fuzzing applications, and explore future directions for combining the two. Our empirical results reveal that the deep learning models are effective only in very limited scenarios, being largely restrained by training data imbalance, dependent labels, model over-generalization, and the insufficient expressiveness of state-of-the-art models. Consequently, the gradients estimated by the models to cover a branch can be less helpful in many scenarios.
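
The general idea the abstract refers to (train a model to predict branch coverage, then use its gradients to decide which input bytes to mutate) can be illustrated with a toy sketch. This is not the authors' setup or any particular fuzzer: the target `branch_taken`, the fixed `INPUT_LEN`, and all helper names are hypothetical, and a single-layer logistic-regression surrogate stands in for a deep network.

```python
import math
import random

INPUT_LEN = 8  # hypothetical fixed input size for the toy target


def branch_taken(data: bytes) -> bool:
    """Toy stand-in for an instrumented program: one branch guarded by byte 3."""
    return data[3] > 200


def normalize(data: bytes):
    """Map each byte to the range [-0.5, 0.5]."""
    return [b / 255.0 - 0.5 for b in data]


def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_surrogate(samples, epochs=20, lr=0.5):
    """Fit a logistic-regression surrogate that predicts branch coverage."""
    weights = [0.0] * INPUT_LEN
    bias = 0.0
    for _ in range(epochs):
        for data, label in samples:
            x = normalize(data)
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = label - p  # gradient of the log-likelihood w.r.t. the logit
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias


def input_gradient(weights, bias, data: bytes):
    """Gradient of the predicted coverage probability w.r.t. each input byte."""
    x = normalize(data)
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return [p * (1.0 - p) * w for w in weights]


def mutate_by_gradient(data: bytes, grad) -> bytes:
    """Push the byte with the largest |gradient| in the direction of its sign."""
    idx = max(range(len(grad)), key=lambda i: abs(grad[i]))
    out = bytearray(data)
    out[idx] = 255 if grad[idx] > 0 else 0
    return bytes(out)


def run_demo(seed: int = 0):
    """Collect coverage labels, train the surrogate, and mutate one seed."""
    rng = random.Random(seed)
    samples = []
    for _ in range(500):
        data = bytes(rng.randrange(256) for _ in range(INPUT_LEN))
        samples.append((data, 1.0 if branch_taken(data) else 0.0))
    weights, bias = train_surrogate(samples)
    seed_input = bytes(INPUT_LEN)  # all-zero seed that misses the branch
    grad = input_gradient(weights, bias, seed_input)
    return seed_input, mutate_by_gradient(seed_input, grad), grad
```

In this toy the branch depends on a single byte through a simple threshold, so a linear surrogate suffices. The abstract's findings concern the harder cases, where imbalanced training data or dependent labels make such estimated gradients less informative.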

Siqi Li

Tianjin University

China

Yun Lin

National University of Singapore

Singapore

Xiaofei Xie

Kyushu University

Japan

Yuekang Li

Nanyang Technological University

Xiaohong Li

Tianjin University

China

Weimin Ge

Tianjin University

Yang Liu

Nanyang Technological University

Jin Song Dong

National University of Singapore

Singapore

Session Program

Thu 18 Nov
Displayed time zone: Hobart

	10:00 - 11:00	LBR + DS Poster (2) (Thursday 21:00 - 00:00) Late Breaking Results / Doctoral Symposium at Kangaroo Chair(s): Xiaoyin Wang University of Texas at San Antonio

	10:00 2m Talk		API Compatibility Issue Detection, Testing and Analysis for Android Apps Doctoral Symposium Tarek Mahmud Texas State University
	10:02 2m Talk		Towards the generation of machine learning defect reports Doctoral Symposium Tuan Dung Lai Deakin University
	10:04 2m Talk		DSInfoSearch: Supporting experimentation process of data scientists Doctoral Symposium Shangeetha Sivasothy Applied Artificial Intelligence Institute, Deakin University
	10:06 2m Talk		A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing Late Breaking Results Siqi Li Tianjin University, Yun Lin National University of Singapore, Xiaofei Xie Kyushu University, Yuekang Li Nanyang Technological University, Xiaohong Li Tianjin University, Weimin Ge Tianjin University, Yang Liu Nanyang Technological University, Jin Song Dong National University of Singapore
	10:08 2m Talk		Counterexample Guided Inductive Repair of Reactive Contracts Late Breaking Results Soha Hussein University of Minnesota, USA / Ain Shams University, Egypt, Vaibhav Sharma University of Minnesota, USA, Stephen McCamant University of Minnesota, USA, Sanjai Rayadurgam University of Minnesota, Mats Heimdahl University of Minnesota
	10:10 2m Talk		AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization Late Breaking Results Ze Tang Software Institute, Nanjing University, Chuanyi Li Software Institute, Nanjing University, Jidong Ge, Xiaoyu Shen Alexa AI, Amazon, Zheling Zhu Software Institute, Nanjing University, Bin Luo Software Institute, Nanjing University
	10:12 2m Talk		An Automated Pipeline for Privacy Leak Analysis of Android Applications Doctoral Symposium Yifan Zhou The University of Adelaide
	10:14 2m Talk		Detecting Adversarial Samples with Graph-Guided Testing Late Breaking Results Zuohui Chen Zhejiang University of Technology, Renxuan Wang Zhejiang University of Technology, Jingyang Xiang Zhejiang University of Technology, Yue Yu College of Computer, National University of Defense Technology, Changsha 410073, China, Xin Xia Huawei Software Engineering Application Technology Lab, Shouling Ji Zhejiang University, Qi Xuan Zhejiang University of Technology, Xiaoniu Yang Zhejiang University of Technology
	10:16 2m Talk		Using Static Analysis to Address Microservice Architecture Reconstruction Late Breaking Results Vincent Bushong Baylor University, Dipta Das Baylor University, Abdullah Al Maruf Baylor University, Tomas Cerny Baylor University
	10:18 2m Talk		Applying Semi-Automated Hyperparameter Tuning for Clustering Algorithms Late Breaking Results Elizabeth Forest James Cook University, Anne Swinbourne James Cook University, Trina Myers Queensland University of Technology, Mitchell Scovell James Cook University
	10:20 2m Talk		Business Process Extraction Using Static Analysis Late Breaking Results Rofiqul Islam Baylor University, Tomas Cerny Baylor University
	10:22 2m Talk		Binary Code Similarity Detection Doctoral Symposium Zian Liu Swinburne University of Technology; Data61, CSIRO, Chao Chen James Cook University, Jun Zhang Digital Research & Innovation Capability Platform, Swinburne University of Technology, Dongxi Liu Data61, CSIRO, Muhammad Ejaz Ahmed Data61, CSIRO, Yang Xiang Digital Research & Innovation Capability Platform, Swinburne University of Technology
	10:24 2m Talk		Improving Mutation-Based Fault Localization with Plausible-code Generating Mutation Operators Late Breaking Results Juyoung Jeon Handong Global University, Shin Hong Handong Global University
	10:26 2m Talk		Using Version Control and Issue Tickets to detect Code Debt and Economical Cost Late Breaking Results Abdullah Al Maruf Baylor University, Noah Lambaria Baylor University, Amr Elsayed Baylor University, Tomas Cerny Baylor University
	10:28 2m Talk		Human-in-the-Loop XAI-enabled Vulnerability Detection, Investigation, and Mitigation Late Breaking Results Tien N. Nguyen University of Texas at Dallas, Kim-Kwang Raymond Choo University of Texas at San Antonio
	10:30 2m Talk		A Prediction Model for Software Requirements Change Impact Doctoral Symposium Kareshna Zamani PhD candidate
	10:32 2m Talk		Leveraging Code Clones and Natural Language Processing for Log Statement Prediction Doctoral Symposium Sina Gholamian University of Waterloo