Understanding the Limitations of C/C++ Binary Third-Party Library Detection Tool: An Empirical Study at Scale
This program is tentative and subject to change.
Detecting third-party libraries (TPLs) in C/C++ binaries is essential for ensuring software security and compliance, particularly in safety- and performance-critical domains. While numerous academic and commercial Software Composition Analysis (SCA) tools have been proposed, their true capabilities remain unclear due to the absence of large-scale benchmarks and systematic evaluation. Equally lacking is a deeper understanding of why these tools often underperform, which limits both research progress and practical adoption.
We address this gap with a large-scale study of binary SCA tools. We construct the largest publicly available benchmark to date, encompassing 38,228 test cases across 1,873 libraries drawn from a defined scope of 13,675 libraries. Using this benchmark, we systematically evaluate 11 representative tools, covering all open-source research prototypes and widely adopted commercial solutions, across versions, architectures, and feature-database scales. Beyond aggregate performance metrics, we perform the first fine-grained, feature-level analysis to identify the intrinsic challenges of binary TPL detection. Our results show that existing tools perform unsatisfactorily, with average recall below 60% and precision around 75%. Feature-level analysis reveals fundamental obstacles: binaries lose most source-code features during compilation, and libraries exhibit high feature overlap due to functional similarity and dependency propagation. These findings explain current shortcomings, and we build on them to provide design recommendations, research directions, and practical guidance for managing open-source risks in binary software.
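For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how precision and recall can be computed for a single test binary, and how overlap between two libraries' feature sets can be measured. It is an illustration only, not the authors' evaluation code: the function names, the choice of Jaccard similarity as the overlap measure, and all library and feature names in the example are assumptions made for this sketch.

```python
def precision_recall(detected, truth):
    """Return (precision, recall) for one binary.

    detected: libraries a tool reports; truth: libraries actually present.
    Both are hypothetical inputs for illustration.
    """
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def jaccard(features_a, features_b):
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data, invented for this example (not taken from the benchmark).
truth = {"zlib", "openssl", "libpng", "curl"}
detected = {"zlib", "openssl", "libjpeg"}
precision, recall = precision_recall(detected, truth)  # 2/3 and 2/4

# Two libraries sharing features, e.g. because one bundles the other.
overlap = jaccard({"inflate", "deflate", "png_read_image"},
                  {"inflate", "deflate", "crc32"})  # 2 shared of 4 total
```

In this toy case the tool misses two of the four libraries and reports one that is absent, so both metrics fall well short of 1.0. A high overlap score between two libraries shows why a feature match alone may not identify which of them is actually present.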
Tue 7 Jul
Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30
11:00 20m Talk | Understanding the Limitations of C/C++ Binary Third-Party Library Detection Tool: An Empirical Study at Scale | Research Papers | CHENGYUE LIU, Zhengzi Xu Imperial Global Singapore, Kaixuan Li Nanyang Technological University, Wu Jiahui Nanyang Technological University, Singapore, Sihao Qiu Institute of Information Engineering Chinese Academy of Sciences & University of Chinese Academy of Sciences, China, Siyuan Li University of Chinese Academy of Sciences & Institute of Information Engineering Chinese Academy of Sciences, China, Siyang Xiong Desay SV Automotive Singapore Pte. Ltd., Yang Xiao Chinese Academy of Sciences, Yang Liu Nanyang Technological University
11:20 20m Talk | Pig: Leveraging Large Language Models for Python Library Migrations | Research Papers | Miryeong Kang Korea University, Wonseok Oh Korea University, Gabin An Korea University, Hakjoo Oh Korea University
11:40 20m Talk | Bringing Managed Language Support to WebAssembly with External Library Linking | Research Papers | Shuyao Jiang The Chinese University of Hong Kong, Ruiying Zeng Fudan University, Yangfan Zhou Fudan University, Michael Lyu The Chinese University of Hong Kong
12:00 10m Talk | Package Dashboard: A Cross-Ecosystem Framework for Dual-Perspective Analysis of Software Packages | Tool Demonstrations
12:10 20m Talk | A Tuple-Oriented Sampling Method for Generating Small Pairwise Covering Arrays in Configurable Software Systems | Research Papers | Kaichen Chen South China University of Technology, Yi Xiang South China University of Technology, Haining Wang South China University of Technology, Jiatong Ma South China University of Technology, Fujian Feng Guizhou Minzu University, Miqing Li University of Birmingham, Han Huang Sun Yat-Sen University