Advancing Binary Code Similarity Detection via Context-Content Fusion and LLM Verification (ASE 2025 - Research Papers)

Who

Chaopeng Dong, Jingdong Guo, Shouguo Yang, Yi Li, Dongliang Fang, Yang Xiao, Yongle Chen, Limin Sun

Track

ASE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 19 Nov 2025 14:00 - 14:10 at Grand Hall 5 - Security 5 Chair(s): Yanjie Jiang

Abstract

Binary code similarity detection (BCSD), essential for binary-related tasks such as vulnerability detection, has attracted increasing attention in recent years. However, existing methods frequently fall short of achieving both high precision and high recall at scale, and their results often lack interpretability because they neglect function context and rely on purely similarity-driven outputs. Our key insights are twofold: 1) Binary functions are not self-contained; they depend on other code and data beyond their own content to fulfill their functionality. 2) Large language models (LLMs) excel not only at analyzing code but also at generating reasonable explanations. Motivated by these insights, we propose a general BCSD framework, Co²FuLL. We first systematically select stable and representative code and data features, along with their corresponding dependencies on the functions, to construct the function context. Then, by fusing function context with content similarities computed by an existing BCSD approach, we substantially narrow down the search space. Finally, we employ LLMs with a carefully designed prompt to verify the remaining candidates and produce clear, human-readable explanations. We conduct comprehensive experiments on a large function pool under varying compilation settings and after binary stripping. The results show that Co²FuLL based on HermesSim and DeepSeek-V3 achieves 80.5% precision and 94.4% recall, improving on the baseline HermesSim by 142.5% and 42.2%, respectively, providing an accurate and interpretable solution for BCSD.
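
The two-stage flow described in the abstract (fuse context with content similarity to shortlist candidates, then verify each survivor and attach an explanation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the `Function` record, the Jaccard overlap used as a context similarity, the linear weighting `alpha`, the `top_k` cutoff, and the `stub_verifier` that stands in for an LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    name: str
    # Hypothetical: similarity to the query from some content-based BCSD model.
    content_score: float
    # Hypothetical context features, e.g. referenced strings and callee names.
    context: set = field(default_factory=set)


def jaccard(a: set, b: set) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def fuse_and_filter(query_context, pool, alpha=0.5, top_k=2):
    """Rank the pool by a weighted sum of content and context similarity
    and keep only the top_k candidates (the narrowed search space)."""
    scored = [
        (alpha * f.content_score + (1 - alpha) * jaccard(query_context, f.context), f)
        for f in pool
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in scored[:top_k]]


def verify(query_context, candidates, verifier):
    """Keep candidates the verifier accepts, paired with its explanation."""
    results = []
    for f in candidates:
        is_match, explanation = verifier(query_context, f)
        if is_match:
            results.append((f.name, explanation))
    return results


def stub_verifier(query_context, candidate):
    """Placeholder for an LLM call: accept if any context feature is shared."""
    shared = sorted(query_context & candidate.context)
    if shared:
        return True, "shared context: " + ", ".join(shared)
    return False, "no shared context"


query_context = {"str:invalid header", "call:memcpy", "call:inflate"}
pool = [
    Function("f_a", 0.90, {"call:printf"}),
    Function("f_b", 0.80, {"str:invalid header", "call:memcpy", "call:inflate"}),
    Function("f_c", 0.40, {"call:memcpy"}),
    Function("f_d", 0.10, set()),
]

shortlist = fuse_and_filter(query_context, pool)
matches = verify(query_context, shortlist, stub_verifier)
print([f.name for f in shortlist])
print(matches)
```

In this toy pool, `f_a` has the highest content score but shares no context with the query, so fusion ranks it below `f_b` and the stub verifier then rejects it. A real verifier would prompt an LLM with the candidate pair instead of comparing feature sets.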

Chaopeng Dong

Institute of Information Engineering, CAS, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Jingdong Guo

Institute of Information Engineering, CAS, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Shouguo Yang

Zhongguancun Laboratory, Beijing, China

Yi Li

Nanyang Technological University

Singapore

Dongliang Fang

Beijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, CAS, China; School of Cyber Security, University of Chinese Academy of Sciences, China

China

Yang Xiao

Chinese Academy of Sciences

China

Yongle Chen

Taiyuan University of Technology, China

Limin Sun

Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences

China

Session Program

Wed 19 Nov
Displayed time zone: Seoul

14:00 - 15:30	Security 5 Research Papers at Grand Hall 5 Chair(s): Yanjie Jiang Peking University

14:00 10m Talk		Advancing Binary Code Similarity Detection via Context-Content Fusion and LLM Verification Research Papers Chaopeng Dong Institute of Information Engineering, CAS, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China, Jingdong Guo Institute of Information Engineering, CAS, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China, Shouguo Yang Zhongguancun Laboratory, Beijing, China, Yi Li Nanyang Technological University, Dongliang Fang Beijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, CAS, China; School of Cyber Security, University of Chinese Academy of Sciences, China, Yang Xiao Chinese Academy of Sciences, Yongle Chen Taiyuan University of Technology, China, Limin Sun Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences
14:10 10m Talk		ACTaint: Agent-Based Taint Analysis for Access Control Vulnerabilities in Smart Contracts Research Papers Huarui Lin Zhejiang University, Zhipeng Gao Shanghai Institute for Advanced Study - Zhejiang University, Jiachi Chen Sun Yat-sen University, Xiang Chen Nantong University, Xiaohu Yang Zhejiang University, Lingfeng Bao Zhejiang University
14:20 10m Talk		AMPLE: Fine-grained File Access Policies for Server Applications Research Papers Seyedhamed Ghavamnia Bloomberg, Julien Vanegue Imperial College London; Bloomberg
14:30 10m Talk		Mockingbird: Efficient Excessive Data Exposures Detection via Dynamic Code Instrumentation Research Papers Chenxiao Xia Beijing Institute of Technology, Jiazheng Sun Fudan University, Jun Zheng Beijing Institute of Technology, Yu-an Tan Beijing Institute of Technology, Hongyi Su Beijing Institute of Technology
14:40 10m Talk		DrainCode: Stealthy Energy Consumption Attacks on Retrieval-Augmented Code Generation via Context Poisoning Research Papers Yanlin Wang Sun Yat-sen University, Jiadong Wu School of Software Engineering, Sun Yat-sen University, Tianyue Jiang Sun Yat-sen University, Mingwei Liu Sun Yat-Sen University, Jiachi Chen Sun Yat-sen University, Chong Wang Nanyang Technological University, Ensheng Shi Huawei, Xilin Liu Huawei Cloud, Yuchi Ma Huawei Cloud Computing Technologies, Hongyu Zhang Chongqing University, Zibin Zheng Sun Yat-sen University
14:50 10m Talk		Finding Insecure State Dependency in DApps via Multi-Source Tracing and Semantic Enrichment Research Papers Jingwen Zhang School of Software Engineering, Sun Yat sen University, Yuhong Nan Sun Yat-sen University, Wei Li School of Software Engineering, Sun Yat sen University, Kaiwen Ning Sun Yat-sen University, Zewei Lin Sun Yat-sen University, Zitong Yao School of Software Engineering, Sun Yat sen University, Yuming Feng Peng Cheng Laboratory, Weizhe Zhang Harbin Institute of Technology, Zibin Zheng Sun Yat-sen University
15:00 10m Talk		Better Safe than Sorry: Preventing Policy Violations through Predictive Root-Cause-Analysis for IoT Systems Research Papers Michael Norris Penn State University, Syed Rafiul Hussain Pennsylvania State University, Gang (Gary) Tan Pennsylvania State University
15:10 10m Talk		Backdoors in Code Summarizers: How Bad Is It? Research Papers Chenyu Wang Singapore Management University, Zhou Yang University of Alberta, Alberta Machine Intelligence Institute, Yaniv Harel Tel Aviv University, David Lo Singapore Management University Pre-print
15:20 10m Talk		ProfMal: Detecting Malicious NPM Packages by the Synergy between Static and Dynamic Analysis Research Papers Yiheng Huang Fudan University, Wen Zheng Fudan University, Susheng Wu Fudan University, Bihuan Chen Fudan University, You Lu Fudan University, Zhuotong Zhou Fudan University, Yiheng Cao Fudan University, Xiaoyu Li Fudan University, Xin Peng Fudan University