Cornucopia: A Framework for Feedback Guided Generation of Binaries
Binary analysis, or the ability to analyze binary code, is an important capability required for many security and software engineering applications. Consequently, there are many binary analysis techniques and tools with varied capabilities. However, testing these tools requires a large, varied binary dataset with corresponding source-level information. In this paper, we present Cornucopia, an architecture-agnostic automated framework that can generate a large number of semantically equivalent binaries from program source code. We exploit compiler optimizations and use feedback-guided learning to maximize the generation of unique binaries that correspond to the same program. Our evaluation shows that Cornucopia was able to generate 309K binaries across four architectures (x86, x64, ARM, MIPS) with an average of 403 binaries for each program. Our experiments also revealed a large number (∼300) of issues with the LLVM optimization scheduler that result in compiler crashes. Our evaluation of four popular binary analysis tools (angr, Ghidra, ida, and radare) using Cornucopia-generated binaries revealed various issues with these tools. Specifically, we found 263 crashes in angr and one memory corruption issue in ida. Our differential testing on the analysis results revealed various semantic bugs in these tools. We also tested machine learning tools (Asm2Vec, SAFE, and Debin) that claim to capture binary semantics and show that they perform very poorly on Cornucopia-generated binaries (e.g., the Debin F1 score dropped to 12.9% from the reported 63.1%). In summary, our exhaustive evaluation shows that Cornucopia is an effective mechanism to generate binaries that can be used to test binary analysis techniques effectively.
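The idea of using feedback to maximize the number of unique binaries can be illustrated with a minimal sketch. This is not Cornucopia's implementation: the flag names, the `toy_compile` stand-in for a compiler invocation, and the mutate-and-keep-if-novel strategy are all assumptions made for illustration. The sketch treats a binary as unique when its SHA-256 digest has not been seen before, and it reuses flag sets that produced a new binary as starting points for further mutation.

```python
import hashlib
import random

# Hypothetical flag names for illustration only; not real compiler options.
FLAGS = ["opt_a", "opt_b", "opt_c", "opt_d", "opt_e", "opt_f"]


def toy_compile(flags):
    """Stand-in for a real compiler invocation (an assumption of this sketch).

    The output depends on only some of the flags, so different flag sets
    can yield identical "binaries", as can happen with a real compiler.
    """
    effective = sorted(f for f in flags if f in {"opt_a", "opt_b", "opt_c"})
    return ("binary:" + ",".join(effective)).encode()


def mutate(flags, rng):
    """Toggle one randomly chosen flag in the set."""
    return flags ^ frozenset([rng.choice(FLAGS)])


def generate_unique_binaries(compile_fn, budget, seed=0):
    """Search flag sets, keeping those that yield a previously unseen binary."""
    rng = random.Random(seed)
    start = frozenset()
    # digest -> first flag set that produced that binary
    unique = {hashlib.sha256(compile_fn(start)).hexdigest(): start}
    pool = [start]   # flag sets that produced new binaries
    tried = {start}  # avoid recompiling the same flag set
    for _ in range(budget):
        candidate = mutate(rng.choice(pool), rng)
        if candidate in tried:
            continue
        tried.add(candidate)
        digest = hashlib.sha256(compile_fn(candidate)).hexdigest()
        if digest not in unique:
            # Feedback step: a novel binary makes this flag set a new seed.
            unique[digest] = candidate
            pool.append(candidate)
    return unique


if __name__ == "__main__":
    found = generate_unique_binaries(toy_compile, budget=500)
    print(len(found), "unique binaries found")
```

In a real setting, `compile_fn` would invoke a compiler with the chosen optimization options and return the bytes of the resulting binary; the novelty test could also be replaced by a finer-grained difference measure than a whole-file hash.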
Wed 12 Oct. Displayed time zone: Eastern Time (US & Canada)
13:30 - 15:30 | Technical Session 15 - Compilers and Languages | Journal-first Papers / Research Papers / Industry Showcase, at Banquet B | Chair(s): Lingming Zhang (University of Illinois at Urbana-Champaign)
13:30 | 20m | Research paper | Cornucopia: A Framework for Feedback Guided Generation of Binaries | Research Papers | Vidush Singhal (Purdue University), Akul Abhilash Pillai (Purdue University), Charitha Saumya (Purdue University), Milind Kulkarni (Purdue University), Aravind Machiry (Purdue University)
13:50 | 20m | Paper | CSMITHEDGE: More Effective Compiler Testing by Handling Undefined Behaviour Less Conservatively | Journal-first Papers | Karine Even-Mendoza (Imperial College London), Cristian Cadar (Imperial College London, UK), Alastair F. Donaldson (Imperial College London)
14:10 | 20m | Research paper | Compiler Testing using Template Java Programs (ACM SIGSOFT Distinguished Paper Award) | Research Papers | Zhiqiang Zang (University of Texas at Austin), Nathan Wiatrek (The University of Texas at Austin), Milos Gligoric (University of Texas at Austin), August Shi (University of Texas at Austin)
14:30 | 20m | Industry talk | Towards Understanding the Performance of Rust | Industry Showcase | Yuchen Zhang (Stevens Institute of Technology), Yunhang Zhang (The University of Utah), Georgios Portokalidis (Stevens Institute of Technology), Jun Xu (The University of Utah)
14:50 | 20m | Research paper | TransRepair: Context-aware Program Repair for Compilation Errors (Virtual) | Research Papers | Xueyang Li (SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China), Shangqing Liu (Nanyang Technological University), Ruitao Feng (Nanyang Technological University), Guozhu Meng (Institute of Information Engineering, Chinese Academy of Sciences), Xiaofei Xie (Singapore Management University, Singapore), Kai Chen (SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China), Yang Liu (Nanyang Technological University)
15:10 | 20m | Research paper | Enriching Compiler Testing with Real Program from Bug Report (Virtual) | Research Papers | Hao Zhong (Shanghai Jiao Tong University)