Revisiting Optimization-Resilience Claims in Binary Diffing Tools: Insights from LLVM Peephole Optimization Analysis
Binary diffing techniques aim to identify differences and similarities between executable files without access to source code. Their potential applications in software security tasks, such as vulnerability search, code clone detection, and malware analysis, have generated a large body of literature over the past few years. A recurring theme in binary diffing research is evaluating resilience against compiler optimization, which is the most common source of syntactic differences in binary code. Although most binary diffing papers claim to be immune to compiler optimization, recent studies have highlighted a pressing need for the research community to revisit these optimization-resilience claims.
In this paper, we investigate the impact of peephole optimization on binary diffing. Mainstream compilers feature a multitude of peephole optimization rules, which locally rewrite input programs by replacing instruction sequences within a window (i.e., a peephole) with shorter and/or faster equivalents. Our research reveals that peephole optimization primarily affects binary code differences at the intra-procedural level, which contradicts the assumptions made by basic-block-centric comparison approaches. We customized an LLVM translation validation tool to separate the impact of peephole optimization from that of the overall optimization process. Our experimental results demonstrate that (1) peephole optimization modifies binary code throughout the whole optimization process, and (2) no existing basic-block-centric comparison tool can properly deal with all changes caused by peephole optimization, leading to further performance loss in downstream applications. Our study introduces a "peephole-oriented" test suite, designed to isolate and measure the impact of peephole optimizations on binary code. This suite provides a new perspective for evaluating the resilience of binary diffing tools against subtle, intra-procedural code changes, setting a new benchmark for future tool development. Our findings challenge existing assumptions in binary diffing and highlight the need for more robust analysis techniques.
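To make the notion of a peephole rewrite concrete, the following is a minimal sketch in Python. It is not taken from the paper or from LLVM's rule set: the tuple-based instruction format, the function name `peephole`, and both rewrite rules are hypothetical, chosen only to show how replacing a short window of instructions changes the instruction sequence that a diffing tool would later compare.

```python
from typing import List, Tuple

# Hypothetical toy instruction format: (opcode, operand, ...).
Instr = Tuple[str, ...]


def peephole(instrs: List[Instr]) -> List[Instr]:
    """Apply two illustrative peephole rules until neither one fires."""
    out = list(instrs)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(out):
            ins = out[i]
            # Rule 1 (window of one): "mul d, s, 2" becomes "shl d, s, 1".
            if ins[0] == "mul" and len(ins) == 4 and ins[3] == "2":
                out[i] = ("shl", ins[1], ins[2], "1")
                changed = True
            # Rule 2 (window of two): "store a, r" directly followed by
            # "load r, a" makes the load redundant, so it is dropped.
            elif (
                ins[0] == "store"
                and len(ins) == 3
                and i + 1 < len(out)
                and out[i + 1] == ("load", ins[2], ins[1])
            ):
                del out[i + 1]
                changed = True
            i += 1
    return out


before = [
    ("mul", "r1", "r0", "2"),
    ("store", "x", "r1"),
    ("load", "r1", "x"),
    ("ret", "r1"),
]
after = peephole(before)
print(after)
# [('shl', 'r1', 'r0', '1'), ('store', 'x', 'r1'), ('ret', 'r1')]
```

Even in this toy setting, the rewritten sequence differs from the original in both opcodes and length, although the two compute the same result. Differences of this kind are what a comparison based on instruction syntax would have to tolerate.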
Tue 24 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
14:00 - 15:20 | Program Analysis 2 | Research Papers / Ideas, Visions and Reflections / Demonstrations | at Pirsenteret 150 | Chair(s): Martin Kellogg (New Jersey Institute of Technology)
14:00 | 10m | Talk | IceBear: A Fine-Grained Incremental Scheduler for C/C++ Static Analyzers | Demonstrations
14:10 | 20m | Talk | Blended Analysis for Predictive Execution | Research Papers | Yi Li (University of Texas at Dallas), Hridya Dhulipala (University of Texas at Dallas), Aashish Yadavally (University of Texas at Dallas), Xiaokai Rong (University of Texas at Dallas), Shaohua Wang (Central University of Finance and Economics), Tien N. Nguyen (University of Texas at Dallas)
14:30 | 20m | Talk | Revisiting Optimization-Resilience Claims in Binary Diffing Tools: Insights from LLVM Peephole Optimization Analysis | Research Papers | Xiaolei Ren (Macau University of Science and Technology), Mengfei Ren (University of Alabama in Huntsville), Jeff Yu Lei (University of Texas at Arlington), Jiang Ming (Tulane University, USA)
14:50 | 20m | Talk | DyLin: A Dynamic Linter for Python | Research Papers | Aryaz Eghbali (University of Stuttgart), Felix Burk (University of Stuttgart), Michael Pradel (University of Stuttgart)
15:10 | 10m | Talk | Do you have 5 min? Improving Call Graph Analysis with Runtime Information | Ideas, Visions and Reflections | Jordan Samhi (University of Luxembourg, Luxembourg), Marc Miltenberger (Fraunhofer SIT; ATHENE), Marco Alecci (University of Luxembourg), Steven Arzt (Fraunhofer SIT; ATHENE), Tegawendé F. Bissyandé (University of Luxembourg), Jacques Klein (University of Luxembourg)
This room is located outside the Clarion Hotel
This room is located in the Pirsenteret (The Pier Center) convention center. It is just outside the hotel, at the back, towards the fjord.
You should be able to go through the emergency exit at the Clarion, just to the side of the Cosmos 3 wing, which will bring you close to Pirsenteret.
The entrance to the center is from here:
https://maps.app.goo.gl/dU3qH6kAimXGBNHe7
Once inside, go straight ahead and you will find signage to reach the room. The room is known as room 150 inside the center.