SANER 2025
Tue 4 - Fri 7 March 2025 Montréal, Québec, Canada

This program is tentative and subject to change.

Thu 6 Mar 2025 14:00 - 14:15 at M-1410 - Search & Similarity

Binary Code Similarity Detection (BCSD) is essential in various binary code security applications, enabling tasks such as vulnerability identification, malware analysis, and detection of code plagiarism. With the growing adoption of deep neural networks (DNNs) in BCSD, there has been significant progress in the identification and classification of similar code segments. However, DNN-based BCSD approaches often suffer from high false positive rates, because DNNs inevitably map different binary functions with complex structures and semantics to similar low-dimensional embeddings.

To alleviate this issue, this paper introduces BinEGA, a novel graph alignment-based approach to enhance the accuracy of DNN-based BCSD approaches. The main idea of BinEGA is to employ a general and low-cost equivalence check through lightweight graph alignment, allowing for the identification and elimination of semantically deviating functions among the top-k candidates retrieved by DNN-based BCSD approaches. During the graph alignment process, we first obtain node embeddings from structural and attribute features. We then employ pairwise comparison of these node embeddings to filter out false positives, because binary code compiled from the same source code always shares similar basic blocks. Our experimental results demonstrate that BinEGA effectively enhances the performance of various cutting-edge DNN-based BCSD approaches across diverse scenarios. For instance, BinEGA significantly enhances RECALL@10 in the cross-optimization scenario for state-of-the-art (SOTA) approaches, with an average improvement of 29.2% for BinaryAI and 33.5% for jTrans. Moreover, BinEGA achieves an 88.9% reduction in execution time compared to other enhancement techniques. In summary, this work provides a robust, generalizable, and efficient solution to improve the reliability of BCSD tools in real-world applications.
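The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the BinEGA implementation: the function names (`cosine`, `alignment_score`, `filter_top_k`), the use of cosine similarity, the best-match alignment rule, and both thresholds are assumptions chosen for illustration only. It assumes each function has already been reduced to a list of basic-block (node) embeddings, and that the candidates arrive in the rank order produced by a DNN-based BCSD model.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def alignment_score(query_blocks: List[Vector],
                    cand_blocks: List[Vector],
                    match_threshold: float = 0.9) -> float:
    """Fraction of query blocks whose best-matching candidate block
    reaches match_threshold (a hypothetical cut-off)."""
    if not query_blocks or not cand_blocks:
        return 0.0
    matched = sum(
        1 for q in query_blocks
        if max(cosine(q, c) for c in cand_blocks) >= match_threshold
    )
    return matched / len(query_blocks)


def filter_top_k(query_blocks: List[Vector],
                 candidates: List[Tuple[str, List[Vector]]],
                 min_score: float = 0.5) -> List[str]:
    """Keep candidates, in their original rank order, whose alignment
    score is at least min_score (a hypothetical cut-off)."""
    return [
        name for name, blocks in candidates
        if alignment_score(query_blocks, blocks) >= min_score
    ]


# Toy data: two-dimensional "embeddings" for two basic blocks per function.
query = [[1.0, 0.0], [0.0, 1.0]]
top_k = [
    ("f_same", [[1.0, 0.05], [0.02, 1.0]]),   # blocks close to the query's
    ("f_other", [[-1.0, 0.0], [1.0, 1.0]]),   # no block close to the query's
]
kept = filter_top_k(query, top_k)
```

In this toy run, `f_other` is dropped because none of its blocks is a close match for any query block, while `f_same` survives. The pairwise comparison costs one similarity computation per pair of blocks, which is cheap when it is applied only to the k retrieved candidates rather than to the whole function pool.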

Thu 6 Mar

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
Search & Similarity - Research Papers / Industrial Track at M-1410
14:00
15m
Talk
BinEGA: Enhancing DNN-based Binary Code Similarity Detection through Efficient Graph Alignment
Research Papers
Shize Zhou Zhejiang University, Lirong Fu Hangzhou Dianzi University, Peiyu Liu Zhejiang University, Wenhai Wang Zhejiang University
14:15
15m
Talk
Evaluating the Effectiveness and Efficiency of Demonstration Retrievers in RAG for Code Tasks
Research Papers
Pengfei He University of Manitoba, Shaowei Wang University of Manitoba, Shaiful Chowdhury University of Manitoba, Tse-Hsun (Peter) Chen Concordia University
14:30
15m
Talk
Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios
Research Papers
Egor Shibaev Constructor University, JetBrains, Denis Sushentsev JetBrains, Yaroslav Golubev JetBrains Research, Aleksandr Khvorov JetBrains, ITMO University
Pre-print
14:45
15m
Talk
Industrial-Scale Neural Network Clone Detection with Disk-Based Similarity Search
Industrial Track
Gul Aftab Ahmed, Muslim Chochlov, Abdul Razzaq, James Vincent Patten, Yuanhua Han, Guoxian Lu, Jim Buckley Lero - The Irish Software Research Centre and University of Limerick, David Gregg Trinity College Dublin, Ireland