ICSE 2023 / ICPC 2023 / Replications and Negative Results (RENE)
Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries
Tue 16 May 2023 14:35 - 14:44 at Meeting Room 106 - Programming Languages, Types, and Complexity Chair(s): Vittoria Nardone
A binary’s behavior is greatly influenced by how the compiler builds its source code. Although most compiler configuration details are abstracted away during compilation, recovering them is useful for reverse engineering and program comprehension tasks on unknown binaries, such as code similarity detection. We observe that previous work has thoroughly explored this on x86-64 binaries. However, there has been limited investigation of ARM binaries, which are increasingly prevalent.
In this paper, we extend previous work with a shallow-learning model that efficiently and accurately recovers compiler configuration properties for ARM binaries. We apply opcode and register-derived features, which have previously been effective on x86-64 binaries, to ARM binaries. Furthermore, we compare this work with that of Pizzolotto et al., a recent architecture-agnostic deep learning model whose dataset and code are available.
We observe that the lightweight features are reproducible on ARM binaries. We achieve over 99% accuracy, on par with state-of-the-art deep learning approaches, while achieving a 583-times speedup during training and a 3,826-times speedup during inference. Finally, we discuss findings of overfitting that went undetected in prior work.
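As a rough illustration of what opcode and register-derived features could look like, the sketch below counts opcode mnemonics and register names in a few ARM disassembly lines and normalizes them into frequencies. This is a minimal sketch, not the authors' implementation: the sample instructions, the `extract_features` helper, the `op:`/`reg:` key prefixes, and the choice of relative frequencies are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical ARM disassembly lines (illustrative only); a real pipeline
# would obtain these from a disassembler run over the binary.
SAMPLE = [
    "push {r4, lr}",
    "mov r4, r0",
    "ldr r0, [r4, #4]",
    "add r0, r0, #1",
    "str r0, [r4, #4]",
    "pop {r4, pc}",
]

# Matches general-purpose registers r0..r15 and the common aliases sp, lr, pc.
REGISTER_RE = re.compile(r"\b(r\d{1,2}|sp|lr|pc)\b")


def extract_features(lines):
    """Return relative opcode and register frequencies (assumed feature form)."""
    opcodes = Counter()
    registers = Counter()
    for line in lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        opcodes[parts[0].lower()] += 1
        operands = parts[1].lower() if len(parts) > 1 else ""
        registers.update(REGISTER_RE.findall(operands))

    total_ops = sum(opcodes.values()) or 1
    total_regs = sum(registers.values()) or 1
    features = {f"op:{name}": n / total_ops for name, n in opcodes.items()}
    features.update({f"reg:{name}": n / total_regs for name, n in registers.items()})
    return features


features = extract_features(SAMPLE)
```

A feature dictionary of this kind could then be vectorized and passed to any shallow classifier; the specific model and feature set used in the paper are not stated on this page.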
Tue 16 May (displayed time zone: Hobart)
13:45 - 15:15 | Programming Languages, Types, and Complexity | Discussion / Research / Replications and Negative Results (RENE) / Journal First | Meeting Room 106 | Chair(s): Vittoria Nardone
13:45 | 9m | Full-paper | How Well Static Type Checkers Work with Gradual Typing? A Case Study on Python | Research | Wenjie Xu Nanjing University, Lin Chen Nanjing University, Chenghao Su Nanjing University, Yimeng Guo Nanjing University, Yanhui Li Nanjing University, Yuming Zhou Nanjing University, Baowen Xu Nanjing University
13:54 | 9m | Full-paper | Too Simple? Notions of Task Complexity used in Maintenance-based Studies of Programming Tools | Research | Patrick Rein University of Potsdam; Hasso Plattner Institute, Tom Beckmann Hasso Plattner Institute, Eva Krebs Hasso Plattner Institute (HPI), University of Potsdam, Germany, Toni Mattis University of Potsdam; Hasso Plattner Institute, Robert Hirschfeld University of Potsdam; Hasso Plattner Institute
14:03 | 9m | Full-paper | Path Complexity Predicts Code Comprehension Effort | Research | Sofiane Dissem Harvey Mudd College, Eli Pregerson Harvey Mudd College, Adi Bhargava Harvey Mudd College, Josh Cordova Harvey Mudd College, Lucas Bang Harvey Mudd College
14:12 | 5m | Short-paper | Revisiting Deep Learning for Variable Type Recovery | Replications and Negative Results (RENE)
14:17 | 9m | Talk | Programming language implementations for context-oriented self-adaptive systems | Journal First | Nicolás Cardozo Universidad de los Andes, Kim Mens Université catholique de Louvain, ICTEAM institute, Belgium
14:26 | 9m | Full-paper | Improving Code Search with Multi-Modal Momentum Contrastive Learning | Research | Zejian Shi Fudan University, Yun Xiong Fudan University, Yao Zhang Fudan University, Zhijie Jiang National University of Defense Technology, Jinjing Zhao National Key Laboratory of Science and Technology on Information System Security, Lei Wang National University of Defense Technology, Shanshan Li National University of Defense Technology
14:35 | 9m | Full-paper | Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries | Replications and Negative Results (RENE)
14:44 | 31m | Panel | Discussion 7 | Discussion