Semantic-Enhanced Indirect Call Analysis with Large Language Models (ASE 2024 - Research Papers)

Who

Baijun Cheng, Cen Zhang, Kailong Wang, Ling Shi, Yang Liu, Haoyu Wang, Yao Guo, Xiangqun Chen

Track

ASE 2024 Research Papers

Time Zone

The program is currently displayed in (GMT-07:00) Pacific Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 30 Oct 2024 10:30 - 10:45 at Compagno - Program analysis 2 Chair(s): Qingkai Shi

Abstract

In contemporary software development, the widespread use of indirect calls to achieve dynamic features poses challenges in constructing precise control flow graphs (CFGs), which further impacts the performance of downstream static analysis tasks. To tackle this issue, various types of indirect call analyzers have been proposed. However, they do not fully leverage the semantic information of the program, limiting their effectiveness in real-world scenarios.

To address these issues, this paper proposes Semantic-Enhanced Analysis (SEA), a new approach to enhance the effectiveness of indirect call analysis. Our fundamental insight is that for common programming practices, indirect calls often exhibit semantic similarity with their invoked targets. This semantic alignment serves as a supportive mechanism for static analysis techniques in filtering out false targets. Notably, contemporary large language models (LLMs) are trained on extensive code corpora, encompassing tasks such as code summarization, making them well-suited for semantic analysis. Specifically, SEA leverages LLMs to generate natural language summaries of both indirect calls and target functions from multiple perspectives. Through further analysis of these summaries, SEA can determine their suitability as caller-callee pairs. Experimental results demonstrate that SEA can significantly enhance existing static analysis methods by producing more precise target sets for indirect calls.
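The pipeline described in the abstract (summarize the indirect call, summarize each candidate target, then judge whether the pair fits) can be sketched as below. This is a minimal illustration, not the authors' implementation: `Candidate`, `filter_targets`, `toy_summarize`, and `toy_judge` are hypothetical names, and the two toy functions are deterministic stand-ins for the LLM calls that SEA would make. The candidate list is assumed to come from an existing static analyzer.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A possible callee reported by an existing static analyzer (hypothetical type)."""
    name: str
    source: str


# In SEA these two roles are played by an LLM; here they are plain callables
# so the sketch stays runnable without any model or network access.
Summarizer = Callable[[str], str]
Judge = Callable[[str, str], bool]


def filter_targets(call_site: str,
                   candidates: Iterable[Candidate],
                   summarize: Summarizer,
                   judge: Judge) -> List[Candidate]:
    """Keep only candidates whose summary is judged compatible with the call site."""
    call_summary = summarize(call_site)
    return [c for c in candidates if judge(call_summary, summarize(c.source))]


def toy_summarize(code: str) -> str:
    """Stand-in for an LLM summary: the sorted set of identifier words longer than two letters."""
    words = {w for w in re.findall(r"[a-z]+", code.lower()) if len(w) > 2}
    return " ".join(sorted(words))


def toy_judge(call_summary: str, target_summary: str) -> bool:
    """Stand-in for the LLM's pair judgment: accept if the summaries share any word."""
    return bool(set(call_summary.split()) & set(target_summary.split()))


if __name__ == "__main__":
    call_site = "handler->on_read(conn, buf)"
    candidates = [
        Candidate("tcp_on_read", "int tcp_on_read(conn *c, buffer *b)"),
        Candidate("log_rotate", "void log_rotate(file *f)"),
    ]
    kept = filter_targets(call_site, candidates, toy_summarize, toy_judge)
    print([c.name for c in kept])  # prints ['tcp_on_read']
```

Passing the summarizer and judge in as callables keeps the filtering step independent of any particular model; a real setup would replace the two toy functions with prompts to an LLM and keep `filter_targets` unchanged.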

Baijun Cheng

Peking University

Cen Zhang

Nanyang Technological University

Singapore

Kailong Wang

Huazhong University of Science and Technology

China

Ling Shi

Nanyang Technological University

Singapore

Yang Liu

Nanyang Technological University

Singapore

Haoyu Wang

Huazhong University of Science and Technology

China

Yao Guo

Peking University

China

Xiangqun Chen

Peking University

China

Session Program

Wed 30 Oct
Displayed time zone: Pacific Time (US & Canada)

10:30 - 12:00	Program analysis 2 (Research Papers / Industry Showcase) at Compagno. Chair(s): Qingkai Shi (Nanjing University)

10:30 15m Talk	Semantic-Enhanced Indirect Call Analysis with Large Language Models	Research Papers	Baijun Cheng (Peking University), Cen Zhang (Nanyang Technological University), Kailong Wang (Huazhong University of Science and Technology), Ling Shi (Nanyang Technological University), Yang Liu (Nanyang Technological University), Haoyu Wang (Huazhong University of Science and Technology), Yao Guo (Peking University), Xiangqun Chen (Peking University)
10:45 15m Talk	Scaler: Efficient and Effective Cross Flow Analysis	Research Papers	Steven (Jiaxun) Tang (University of Massachusetts Amherst), Mingcan Xiang (University of Massachusetts Amherst), Yang Wang (The Ohio State University), Bo Wu (Colorado School of Mines), Jianjun Chen (ByteDance), Tongping Liu (ByteDance)
11:00 15m Talk	AXA: Cross-Language Analysis through Integration of Single-Language Analyses	Research Papers	Tobias Roth (TU Darmstadt | ATHENE - National Research Center for Applied Cybersecurity, Darmstadt), Julius Näumann (TU Darmstadt | ATHENE - National Research Center for Applied Cybersecurity, Darmstadt), Dominik Helm (University of Duisburg-Essen; TU Darmstadt; National Research Center for Applied Cybersecurity ATHENE), Sven Keidel (TU Darmstadt), Mira Mezini (TU Darmstadt; hessian.AI; National Research Center for Applied Cybersecurity ATHENE)
11:15 15m Talk	TypeFSL: Type Prediction from Binaries via Inter-procedural Data-flow Analysis and Few-shot Learning	Research Papers	Zirui Song (The Chinese University of Hong Kong), YuTong Zhou (The Chinese University of Hong Kong), Shuaike Dong (Ant Group), Ke Zhang, Kehuan Zhang (The Chinese University of Hong Kong)
11:30 15m Talk	Experience Report on Applying Program Analysis Techniques for Mainframe Application Understanding	Industry Showcase	Shivali Agarwal (IBM), Hiroaki Nakamura (IBM Research Tokyo), Rami Katan (IBM Research Haifa)
11:45 15m Talk	Diagnosis via Proofs of Unsatisfiability for First-Order Logic with Relational Objects	Research Papers	Nick Feng (University of Toronto), Lina Marsso (University of Toronto), Marsha Chechik (University of Toronto)