BinQuery: A Novel Framework for Natural Language-Based Binary Code Retrieval
Binary Function Retrieval (BFR) is crucial in reverse engineering for identifying specific functions in binary code, especially those associated with malicious behavior or vulnerabilities. Traditional BFR methods rely on heuristics, often lacking the efficiency and adaptability needed for large-scale or diverse binary analysis tasks. To address these challenges, we present BinQuery, a Natural Language-based BFR (NL-based BFR) framework that uses natural language queries to retrieve relevant binary functions with improved flexibility and precision. BinQuery introduces innovative techniques to bridge information gaps between binary code and natural language, achieves fine-grained alignment for enhanced retrieval accuracy, and leverages Large Language Models (LLMs) to refine queries and generate diverse descriptions. Tested on the ViC and Magma datasets, BinQuery surpasses current state-of-the-art methods, achieving a 42.55% increase in recall@1 on ViC and a 4x improvement on Magma. Our framework marks a significant advancement for NL-based BFR, enhancing the efficacy of binary analysis for both general reverse engineering and vulnerability discovery.
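For readers unfamiliar with the recall@1 metric reported above, the following is a minimal sketch of how recall@k is commonly computed for a retrieval task. It is not taken from BinQuery; the function name `recall_at_k`, the toy rankings, and the function identifiers are all hypothetical and serve only to illustrate the metric.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[str]],
    target_ids: Sequence[str],
    k: int = 1,
) -> float:
    """Return the fraction of queries whose target appears in the top-k results.

    ranked_ids[i] is the ranked list of candidate function names returned
    for query i; target_ids[i] is the single correct function for query i.
    """
    if len(ranked_ids) != len(target_ids):
        raise ValueError("ranked_ids and target_ids must have the same length")
    if not target_ids:
        return 0.0
    hits = sum(
        1
        for ranked, target in zip(ranked_ids, target_ids)
        if target in ranked[:k]
    )
    return hits / len(target_ids)


# Hypothetical toy data: four natural language queries, each with a
# ranked list of retrieved binary functions and one ground-truth target.
rankings = [
    ["parse_header", "read_chunk", "free_buf"],
    ["md5_update", "sha1_init", "aes_encrypt"],
    ["send_packet", "open_socket", "close_socket"],
    ["png_decode", "jpeg_decode", "gif_decode"],
]
targets = ["parse_header", "sha1_init", "open_socket", "bmp_decode"]

recall_1 = recall_at_k(rankings, targets, k=1)
recall_3 = recall_at_k(rankings, targets, k=3)
print(f"recall@1 = {recall_1:.2f}, recall@3 = {recall_3:.2f}")
```

With this definition, recall@1 counts a query as successful only when the correct function is ranked first, which is why it is the strictest of the usual recall@k settings.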
Fri 27 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
14:00 - 15:30 | Binary Code Analysis and Optimization | Research Papers / Tool Demonstrations at Cosmos 3B | Chair(s): Andreas Zeller (CISPA Helmholtz Center for Information Security)
14:00 25m Talk | BinQuery: A Novel Framework for Natural Language-Based Binary Code Retrieval | Research Papers | Bolun Zhang (Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences, China), Zeyu Gao (Tsinghua University), Hao Wang (Tsinghua University), Yuxin Cui (Institute for Network Sciences and Cyberspace, Tsinghua University), Siliang Qin (Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences, China), Chao Zhang (Tsinghua University), Kai Chen (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Beibei Zhao (Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences, China)
14:25 25m Talk | Wemby’s Web: Hunting for Memory Corruption in WebAssembly | Research Papers | Oussama Draissi (University of Duisburg-Essen), Tobias Cloosters (University of Duisburg-Essen), David Klein (TU Braunschweig), Michael Rodler (Amazon Web Services), Marius Musch (TU Braunschweig), Martin Johns (TU Braunschweig), Lucas Davi (University of Duisburg-Essen)
14:50 25m Talk | Doctor: Optimizing Container Rebuild Efficiency by Instruction Re-Orchestration | Research Papers | Zhiling Zhu (Zhejiang University of Technology), Tieming Chen (Zhejiang University of Technology), Chengwei Liu (Nanyang Technological University), Han Liu (The Hong Kong University of Science and Technology), Qijie Song (Zhejiang University of Technology), Zhengzi Xu (Nanyang Technological University; Imperial Global Singapore), Yang Liu (Nanyang Technological University)
15:15 15m Demonstration | ReGraph: A Tool for Binary Similarity Identification | Tool Demonstrations
Cosmos 3B is the second room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.