ICPC 2023
Mon 15 - Tue 16 May 2023 Melbourne, Australia
co-located with ICSE 2023
A binary’s behavior is greatly influenced by how the compiler builds its source code. Although most compiler configuration details are abstracted away during compilation, recovering them is useful for reverse engineering and program comprehension tasks on unknown binaries, such as code similarity detection. We observe that previous work has thoroughly explored this on x86-64 binaries. However, there has been limited investigation of ARM binaries, which are increasingly prevalent.

In this paper, we extend previous work with a shallow-learning model that efficiently and accurately recovers compiler configuration properties for ARM binaries. We apply opcode and register-derived features, which have previously been effective on x86-64 binaries, to ARM binaries. Furthermore, we compare this work with Pizzolotto et al., a recent architecture-agnostic model that uses deep learning, whose dataset and code are available.

We observe that the lightweight features are reproducible on ARM binaries. We achieve over 99% accuracy, on par with state-of-the-art deep learning approaches, with a 583-times speedup during training and a 3,826-times speedup during inference. Finally, we discuss evidence of overfitting that went undetected in prior work.
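
As a rough illustration of what "opcode and register-derived features" might look like, the sketch below counts opcode and register frequencies in a list of ARM disassembly lines and normalizes them. It is not the authors' feature set: the function name `extract_features`, the register pattern, and the sample instructions are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed register pattern for 32-bit ARM: r0-r15 plus common aliases.
REGISTER_RE = re.compile(r"\b(r1[0-5]|r[0-9]|sp|lr|pc|fp|ip)\b")


def extract_features(disassembly):
    """Return normalized opcode and register frequencies (hypothetical helper)."""
    opcodes = Counter()
    registers = Counter()
    for line in disassembly:
        # Split "mnemonic operands" into at most two parts.
        parts = line.strip().lower().split(None, 1)
        if not parts:
            continue
        opcodes[parts[0]] += 1
        if len(parts) > 1:
            registers.update(REGISTER_RE.findall(parts[1]))

    features = {}
    for prefix, counts in (("op", opcodes), ("reg", registers)):
        total = sum(counts.values())
        for name, count in counts.items():
            # Each group is normalized separately so its values sum to 1.
            features[f"{prefix}:{name}"] = count / total
    return features


# Made-up sample function body, for illustration only.
sample = ["push {r4, lr}", "mov r4, r0", "add r0, r4, #1", "pop {r4, pc}"]
features = extract_features(sample)
```

A feature dictionary of this kind could then be fed to any shallow classifier; which model and which exact features the paper uses is not stated in the abstract.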

Tue 16 May

Displayed time zone: Hobart

13:45 - 15:15
Programming Languages, Types, and Complexity
Discussion / Research / Replications and Negative Results (RENE) / Journal First at Meeting Room 106
Chair(s): Vittoria Nardone
13:45
9m
Full-paper
How Well Static Type Checkers Work with Gradual Typing? A Case Study on Python
Research
Wenjie Xu Nanjing University, Lin Chen Nanjing University, Chenghao Su Nanjing University, Yimeng Guo Nanjing University, Yanhui Li Nanjing University, Yuming Zhou Nanjing University, Baowen Xu Nanjing University
13:54
9m
Full-paper
Too Simple? Notions of Task Complexity used in Maintenance-based Studies of Programming Tools
Research
Patrick Rein University of Potsdam; Hasso Plattner Institute, Tom Beckmann Hasso Plattner Institute, Eva Krebs Hasso Plattner Institute (HPI), University of Potsdam, Germany, Toni Mattis University of Potsdam; Hasso Plattner Institute, Robert Hirschfeld University of Potsdam; Hasso Plattner Institute
14:03
9m
Full-paper
Path Complexity Predicts Code Comprehension Effort
Research
Sofiane Dissem Harvey Mudd College, Eli Pregerson Harvey Mudd College, Adi Bhargava Harvey Mudd College, Josh Cordova Harvey Mudd College, Lucas Bang Harvey Mudd College
14:12
5m
Short-paper
Revisiting Deep Learning for Variable Type Recovery
Replications and Negative Results (RENE)
Kevin Cao Vanderbilt University, Kevin Leach Vanderbilt University
14:17
9m
Talk
Programming language implementations for context-oriented self-adaptive systems
Journal First
Nicolás Cardozo Universidad de los Andes, Kim Mens Université catholique de Louvain, ICTEAM institute, Belgium
14:26
9m
Full-paper
Improving Code Search with Multi-Modal Momentum Contrastive Learning
Research
Zejian Shi Fudan University, Yun Xiong Fudan University, Yao Zhang Fudan University, Zhijie Jiang National University of Defense Technology, Jinjing Zhao National Key Laboratory of Science and Technology on Information System Security, Lei Wang National University of Defense Technology, Shanshan Li National University of Defense Technology
14:35
9m
Full-paper
Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries
Replications and Negative Results (RENE)
Jason Kim Georgia Tech, Daniel Genkin Georgia Tech, Kevin Leach Vanderbilt University
14:44
31m
Panel
Discussion 7
Discussion