CG-Bench: Can Language Models Assist Call Graph Construction in the Real World?
This program is tentative and subject to change.
Language models for coding are shifting their focus from the function level to the repository level, where function invocations are complex. We introduce CG-Bench, the first manually constructed benchmark that measures how well language models understand call graphs. The benchmark contains 104 call sites, together with the code snippets associated with their call chains, drawn from 7 representative open-source C/C++ projects. Language models are tasked with inferring the call targets from these snippets. We evaluate four popular language models on CG-Bench. Surprisingly, all four models achieve accuracy above 50% under different prompt settings, and Deepseek-6.7b with few-shot prompts reaches 69.70%. We further report four findings from a micro study, which show that using language models for call graph construction is promising and that performance can be improved by prompt hacking, removing irrelevant information, etc.
Wed 15 Oct (displayed time zone: Perth)
13:40 - 15:20 | LLMs for Program Analysis and Verification | LMPL at Orchid East | Chair(s): Guannan Wei (Tufts University)
13:40 | 15m Talk | Function Renaming in Reverse Engineering of Embedded Device Firmware with ChatGPT | LMPL | Puzhuo Liu (Ant Group & Tsinghua University), Peng Di (Ant Group & UNSW Sydney), Yu Jiang (Tsinghua University)
13:55 | 15m Talk | Enhancing Semantic Understanding in Pointer Analysis Using Large Language Models | LMPL | Baijun Cheng (Peking University), Kailong Wang (Huazhong University of Science and Technology), Ling Shi (Nanyang Technological University), Haoyu Wang (Huazhong University of Science and Technology), Yao Guo (Peking University), Ding Li (Peking University), Xiangqun Chen (Peking University)
14:10 | 15m Talk | Improving SAST Detection Capability with LLMs and Enhanced DFA | LMPL | Yuan Luo (Tencent Security Yunding Lab), Zhaojun Chen (Tencent Security Yunding Lab), Yuxin Dong (Peking University), Haiquan Zhang (Tencent Security Yunding Lab), Yi Sun (Tencent Security Yunding Lab), Fei Xie (Tencent Security Yunding Lab), Zhiqiang Dong (Tencent Security Yunding Lab)
14:25 | 15m Talk | ClearAgent: Agentic Binary Analysis for Effective Vulnerability Detection | LMPL | Xiang Chen (The Hong Kong University of Science and Technology), Anshunkang Zhou (The Hong Kong University of Science and Technology), Chengfeng Ye (The Hong Kong University of Science and Technology), Charles Zhang (The Hong Kong University of Science and Technology)
14:40 | 15m Talk | CG-Bench: Can Language Models Assist Call Graph Construction in the Real World? | LMPL | Ting Yuan, Wenrui Zhang (Huawei Technologies Co., Ltd), Dong Chen (Huawei), Jie Wang (Huawei Technologies Co., Ltd) | Pre-print
14:55 | 20m Talk | Beyond Static Pattern Matching? Rethinking Automatic Cryptographic API Misuse Detection in the Era of LLMs | LMPL |