ISSTA 2022
Mon 18 - Fri 22 July 2022 Online
Wed 20 Jul 2022 09:20 - 09:40 at ISSTA 2 - Session 2-4: Neural Networks, Learning, NLP B
Thu 21 Jul 2022 17:00 - 17:20 at ISSTA 2 - Session 3-6: Neural Networks, Learning, NLP F

Cross-platform binary analysis requires a common representation of binaries across platforms, on which a specific analysis can be performed. Recent work proposed to learn low-dimensional, numeric vector representations (i.e., embeddings) of disassembled binary code, and to perform binary analysis in the embedding space. However, existing techniques fall short in that they are either (i) specific to a single platform, producing embeddings that are not aligned across platforms, or (ii) not designed to capture the rich contextual information available in a disassembled binary.

We present a novel deep learning-based method, XBA, which addresses both problems. We first abstract binaries as typed graphs, dubbed binary disassembly graphs (BDGs), which encode control flow and other rich contextual information about the different entities found in a disassembled binary, including basic blocks, external functions called, and string literals referenced. We then formulate binary code representation learning as a graph alignment problem, i.e., finding the node correspondences between BDGs extracted from two binaries compiled for different platforms. XBA uses graph convolutional networks to learn the semantics of each node by (i) using its rich contextual information encoded in the BDG, and (ii) aligning its embeddings across platforms. Our formulation allows XBA to learn semantic alignments between two BDGs in a semi-supervised manner, requiring only a limited number of node pairs to be aligned across platforms for training. Our evaluation shows that XBA can learn semantically rich embeddings of binaries aligned across platforms without a priori platform-specific knowledge. When trained with only 50% of the oracle alignments, XBA was able to predict, on average, 75% of the remaining alignments. Our case studies further show that the learned embeddings encode knowledge useful for cross-platform binary analysis.
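The graph-alignment formulation above can be illustrated with a minimal sketch. This is not XBA's implementation: it assumes a toy four-node BDG with hypothetical node names and hand-set 2-D features, replaces the graph convolutional network with a single untrained mean-aggregation step (no learned weights), and aligns nodes by L1 nearest neighbour. The names `BDG`, `toy_bdg`, `propagate`, `align`, and `hits_at_1` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class BDG:
    """Toy typed graph: typed nodes with feature vectors, undirected edges."""
    nodes: dict = field(default_factory=dict)  # name -> (node_type, features)
    edges: list = field(default_factory=list)  # [(u, v), ...]

    def neighbors(self, name):
        return [v if u == name else u for u, v in self.edges if name in (u, v)]


def toy_bdg(prefix, feats):
    """Build the same hypothetical 4-node graph for one platform."""
    g = BDG()
    # Node types are recorded to mirror the typed-graph idea; this sketch
    # does not use them during propagation or alignment.
    types = {"bb0": "basic_block", "bb1": "basic_block",
             "printf": "external_function", "hello": "string_literal"}
    for short, node_type in types.items():
        g.nodes[f"{prefix}:{short}"] = (node_type, feats[short])
    for u, v in [("bb0", "bb1"), ("bb0", "printf"), ("bb1", "hello")]:
        g.edges.append((f"{prefix}:{u}", f"{prefix}:{v}"))
    return g


def propagate(graph):
    """One mean-aggregation step over each node and its neighbours
    (an assumed, weight-free stand-in for a GCN layer)."""
    out = {}
    for name, (_, feats) in graph.nodes.items():
        group = [feats] + [graph.nodes[n][1] for n in graph.neighbors(name)]
        out[name] = [sum(col) / len(group) for col in zip(*group)]
    return out


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def align(src_emb, dst_emb):
    """Map each source node to its nearest target node under L1 distance."""
    return {s: min(dst_emb, key=lambda d: l1(vec, dst_emb[d]))
            for s, vec in src_emb.items()}


def hits_at_1(predicted, pairs):
    """Fraction of the given oracle pairs whose top-1 prediction is correct."""
    return sum(predicted[s] == t for s, t in pairs.items()) / len(pairs)


# Hand-set features (assumption): the ARM copy is a perturbed x86 copy.
x86 = toy_bdg("x86", {"bb0": [1.0, 0.0], "bb1": [0.0, 1.0],
                      "printf": [1.0, 1.0], "hello": [0.0, 0.0]})
arm = toy_bdg("arm", {"bb0": [0.9, 0.1], "bb1": [0.1, 0.9],
                      "printf": [1.0, 0.9], "hello": [0.1, 0.0]})

oracle = {f"x86:{n}": f"arm:{n}" for n in ["bb0", "bb1", "printf", "hello"]}
seeds = dict(list(oracle.items())[:2])  # the 50% a trained model would learn from
held_out = {s: t for s, t in oracle.items() if s not in seeds}

predicted = align(propagate(x86), propagate(arm))
score = hits_at_1(predicted, held_out)
print(score)
```

Because nothing is trained here, the seed pairs only define which half of the oracle is held out for scoring. In a learned model, the seeds would instead drive a loss that pulls aligned node embeddings together across the two graphs.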

Wed 20 Jul

Displayed time zone: Seoul

08:40 - 09:40
Session 2-4: Neural Networks, Learning, NLP B
Technical Papers at ISSTA 2
08:40
20m
Talk
ASRTest: Automated Testing for Deep-Neural-Network-Driven Speech Recognition Systems
Technical Papers
Pin Ji Nanjing University, Yang Feng Nanjing University, Jia Liu Nanjing University, Zhihong Zhao Nanjing Tech University, Zhenyu Chen Nanjing University
DOI
09:00
20m
Talk
BET: Black-box Efficient Testing for Convolutional Neural Networks
Technical Papers
Wang Jialai Tsinghua University, Han Qiu Tsinghua University, Yi Rong Tsinghua University, Hengkai Ye Purdue University, Qi Li Tsinghua University, Zongpeng Li Tsinghua University, Chao Zhang Tsinghua University
DOI
09:20
20m
Talk
Improving Cross-Platform Binary Analysis using Representation Learning via Graph Alignment
Technical Papers
Geunwoo Kim University of California, Irvine, USA, Sanghyun Hong Oregon State University, Michael Franz University of California, Irvine, Dokyung Song Yonsei University, South Korea
DOI

Thu 21 Jul

Displayed time zone: Seoul

16:20 - 17:40
Session 3-6: Neural Networks, Learning, NLP F
Technical Papers at ISSTA 2
16:20
20m
Talk
AEON: A Method for Automatic Evaluation of NLP Test Cases
Technical Papers
Jen-tse Huang The Chinese University of Hong Kong, Jianping Zhang The Chinese University of Hong Kong, Wenxuan Wang The Chinese University of Hong Kong, Pinjia He The Chinese University of Hong Kong, Shenzhen, Yuxin Su Sun Yat-sen University, Michael Lyu The Chinese University of Hong Kong
DOI
16:40
20m
Talk
HybridRepair: Towards Annotation-Efficient Repair for Deep Learning Models
Technical Papers
Yu Li The Chinese University of Hong Kong, Muxi Chen The Chinese University of Hong Kong, Xu, Qiang
DOI
17:00
20m
Talk
Improving Cross-Platform Binary Analysis using Representation Learning via Graph Alignment
Technical Papers
Geunwoo Kim University of California, Irvine, USA, Sanghyun Hong Oregon State University, Michael Franz University of California, Irvine, Dokyung Song Yonsei University, South Korea
DOI
17:20
20m
Talk
Human-in-the-Loop Oracle Learning for Semantic Bugs in String Processing Programs
Technical Papers
Charaka Geethal Monash University, Thuan Pham The University of Melbourne, Aldeida Aleti Monash University, Marcel Böhme MPI-SP, Germany and Monash University, Australia
DOI Pre-print