A Large Scale Study of AI-based Binary Function Similarity Detection Techniques for Security Researchers and Practitioners (ASE 2025 - Research Papers)

Who

Jingyi Shi, Yufeng Chen, Yang Xiao, Yuekang Li, Zhengzi Xu, Sihao Qiu, Chi Zhang, Keyu Qi, Yeting Li, Xingchu Chen, Yanyan Zou, Yang Liu, Wei Huo

Track

ASE 2025 Research Papers

When

Wed 19 Nov 2025 12:10 - 12:20 at Grand Hall 5 - Security 4 Chair(s): Saeid Tizpaz-Niari

Abstract

Binary Function Similarity Detection (BFSD) is a foundational technique in software security, underpinning a wide range of applications including vulnerability detection and malware analysis. Recent advances in AI-based BFSD tools have led to significant performance improvements. However, existing evaluations of these tools suffer from three key limitations: a lack of in-depth analysis of performance-influencing factors, an absence of realistic application analysis, and reliance on small-scale or low-quality datasets.

In this paper, we present the first large-scale empirical study of AI-based BFSD tools to address these gaps. We construct two high-quality and diverse datasets: BinAtlas, comprising 12,453 binaries and over 7 million functions for capability evaluation; and BinAres, containing 12,291 binaries and 54 real-world 1-day vulnerabilities for evaluating vulnerability detection performance in practical IoT firmware settings. Using these datasets, we evaluate nine representative BFSD tools, analyze the challenges and limitations of existing BFSD tools, and investigate the consistency among BFSD tools. We also propose an actionable strategy for combining BFSD tools to enhance overall performance (an improvement of 13.4%). Our study not only advances the practical adoption of BFSD tools but also provides valuable resources and insights to guide future research in scalable and automated binary similarity detection.

Jingyi Shi

Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences

Yufeng Chen

Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences

China

Yang Xiao

Chinese Academy of Sciences

China

Yuekang Li

UNSW

Australia

Zhengzi Xu

Imperial Global Singapore

Singapore

Sihao Qiu

Institute of Information Engineering Chinese Academy of Sciences & University of Chinese Academy of Sciences, China

Chi Zhang

Institute of Information Engineering, CAS; School of Cyber Security, UCAS

Keyu Qi

Institute of Information Engineering, CAS; School of Cyber Security, UCAS

Yeting Li

Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences

China

Xingchu Chen

Institute of Information Engineering, CAS; School of Cyber Security, UCAS

Yanyan Zou

Institute of Information Engineering, Chinese Academy of Sciences

Yang Liu

Nanyang Technological University

Singapore

Wei Huo

Institute of Information Engineering at Chinese Academy of Sciences

China

Session Program

Wed 19 Nov
Displayed time zone: Seoul

11:00 - 12:30	Security 4 (Research Papers / Journal-First) at Grand Hall 5 Chair(s): Saeid Tizpaz-Niari, University of Illinois Chicago

11:00 10m Talk		When Does Wasm Malware Detection Fail? A Systematic Analysis of Their Robustness to Evasion Research Papers Taeyoung Kim Sungkyunkwan University, Sanghak Oh Sungkyunkwan University, Kiho Lee ETRI (Electronics and Telecommunications Research Institute), South Korea, Weihang Wang University of Southern California, Yonghwi Kwon University of Maryland, Sanghyun Hong Oregon State University, Hyoungshick Kim Sungkyunkwan University
11:10 10m Talk		RFCAudit: AI Agent for Auditing Protocol Implementations Against RFC Specifications Research Papers Mingwei Zheng Purdue University, Chengpeng Wang Purdue University, Xuwei Liu Purdue University, USA, Jinyao Guo Purdue University, Shiwei Feng Purdue University, Xiangyu Zhang Purdue University
11:20 10m Talk		Time to separate from StackOverflow and match with ChatGPT for encryption Journal-First Ehsan Firouzi TU Clausthal, Mohammad Ghafari TU Clausthal
11:30 10m Talk		Demystifying Cross-Language C/C++ Binaries: A Robust Software Component Analysis Approach Research Papers Meiqiu Xu Northeastern University, China, Ying Wang Northeastern University, Wei Tang HUA WEI, Xian Zhan HUA WEI, Shing-Chi Cheung Hong Kong University of Science and Technology, Hai Yu Northeastern University, China, Zhiliang Zhu Northeastern University, China
11:40 10m Talk		Detecting Various DeFi Price Manipulations with LLM Reasoning Research Papers Juantao Zhong Lingnan University, Daoyuan Wu Lingnan University, Ye Liu Singapore Management University, Maoyi Xie Nanyang Technological University, Yang Liu Nanyang Technological University, Yi Li Nanyang Technological University, Ning Liu City University of Hong Kong
11:50 10m Talk		Uncovering Prompt Elements: Cloning System Prompts from Behavioral Traces Research Papers Yi Qian State Key Laboratory for Novel Software Technology, Nanjing University, Pengfei State Key Laboratory for Novel Software Technology, Nanjing University, Hao Wu, Ligeng Chen Honor Device Co., Ltd, Bing Mao Nanjing University
12:00 10m Talk		CRYPTBARA: Dependency-Guided Detection of Python Cryptographic API Misuses Research Papers seogyeong cho Korea University, Seungeun Yu Korea University, Seunghoon Woo Korea University
12:10 10m Talk		A Large Scale Study of AI-based Binary Function Similarity Detection Techniques for Security Researchers and Practitioners Research Papers Jingyi Shi Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences, Yufeng Chen Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Yang Xiao Chinese Academy of Sciences, Yuekang Li UNSW, Zhengzi Xu Imperial Global Singapore, Sihao Qiu Institute of Information Engineering Chinese Academy of Sciences & University of Chinese Academy of Sciences, China, Chi Zhang Institute of Information Engineering, CAS; School of Cyber Security, UCAS, Keyu Qi Institute of Information Engineering, CAS; School of Cyber Security, UCAS, Yeting Li Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xingchu Chen Institute of Information Engineering, CAS; School of Cyber Security, UCAS, Yanyan Zou Institute of Information Engineering, Chinese Academy of Sciences, Yang Liu Nanyang Technological University, Wei Huo Institute of Information Engineering at Chinese Academy of Sciences
12:20 10m Talk		FirmProj: Detecting Firmware Leakage in IoT Update Processes via Companion App Analysis Research Papers Wenzhi Li Shandong University, Jialong Guo Shandong University, Jiongyi Chen National University of Defense Technology, Fan Li Shandong University, Yujie Xing Shandong University, Yanbo Xu Shanghai Jiao Tong University, Shishuai Yang Shandong University, Wenrui Diao Shandong University