CKGFuzzer: LLM-Based Fuzz Driver Generation Enhanced By Code Knowledge Graph (ICSE 2025 - Industry Challenge Track)

Who

Hanxiang Xu, Wei Ma, Ting Zhou, Yanjie Zhao, Kai Chen, Qiang Hu, Yang Liu, Haoyu Wang

Track

ICSE 2025 Industry Challenge Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 1 May 2025 14:00 - 14:15 at 211 - Industry Challenge Presentations. Chair(s): Federica Sarro, Xin Xia

Abstract

In recent years, the programming capabilities of large language models (LLMs) have garnered significant attention. Fuzz testing, a highly effective technique, plays a key role in enhancing software reliability and detecting vulnerabilities. However, traditional fuzz testing tools rely on manually crafted fuzz drivers, which can limit both testing efficiency and effectiveness. To address this challenge, we propose an automated fuzz testing method driven by a code knowledge graph and powered by an LLM-based intelligent agent system, referred to as CKGFuzzer. We approach fuzz driver creation as a code generation task, leveraging the knowledge graph of the code repository to automate the generation process within the fuzzing loop, while continuously refining both the fuzz driver and input seeds. The code knowledge graph is constructed through interprocedural program analysis, where each node in the graph represents a code entity, such as a function or a file. The knowledge graph-enhanced CKGFuzzer not only effectively resolves compilation errors in fuzz drivers and generates input seeds tailored to specific API usage scenarios, but also analyzes fuzz driver crash reports, assisting developers in improving code quality. By querying the knowledge graph of the code repository and learning from API usage scenarios, we can better identify testing targets and understand the specific purpose of each fuzz driver. We evaluated our approach using eight open-source software projects. The experimental results indicate that CKGFuzzer achieved an average improvement of 8.73% in code coverage compared to state-of-the-art techniques. Additionally, CKGFuzzer reduced the manual review workload in crash case analysis by 84.4% and successfully detected 11 real bugs across the tested libraries. 
Our research enhances the overall performance of fuzz testing by refining fuzz driver generation strategies and input seed analysis, offering a more effective solution for vulnerability remediation and software quality improvement.
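
The abstract describes a code knowledge graph whose nodes are code entities such as functions or files, and which is queried to identify testing targets. The following is a minimal Python sketch of that idea under stated assumptions; it is not CKGFuzzer's implementation. The class `CodeKnowledgeGraph`, the relation names `contains` and `calls`, the method `driver_context`, and the example C file and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CodeKnowledgeGraph:
    """Toy code knowledge graph (hypothetical, for illustration only)."""
    # node name -> "file" or "function"
    kinds: dict = field(default_factory=dict)
    # (source node, relation) -> set of target nodes
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_file(self, path, functions):
        """Register a file node and the function nodes it contains."""
        self.kinds[path] = "file"
        for name in functions:
            self.kinds[name] = "function"
            self.edges[(path, "contains")].add(name)

    def add_call(self, caller, callee):
        """Record that `caller` invokes `callee` (an interprocedural edge)."""
        self.edges[(caller, "calls")].add(callee)

    def callees(self, func, depth=1):
        """Functions reachable from `func` through at most `depth` call edges."""
        seen, frontier = set(), {func}
        for _ in range(depth):
            reached = set()
            for src in frontier:
                reached |= self.edges.get((src, "calls"), set())
            frontier = reached - seen - {func}
            seen |= frontier
        return seen

    def siblings(self, func):
        """Other functions defined in the same file as `func`."""
        found = set()
        for (src, relation), targets in self.edges.items():
            if relation == "contains" and func in targets:
                found |= targets - {func}
        return found

    def driver_context(self, api, depth=2):
        """Related functions one might pass to an LLM when drafting a driver for `api`."""
        return sorted(self.callees(api, depth) | self.siblings(api))


# Hypothetical library with two files and a short call chain.
graph = CodeKnowledgeGraph()
graph.add_file("image.c", ["parse_header", "free_image"])
graph.add_file("io.c", ["read_u32", "read_byte"])
graph.add_call("parse_header", "read_u32")
graph.add_call("read_u32", "read_byte")

print(graph.driver_context("parse_header"))
```

For the hypothetical target `parse_header`, the query returns the functions it calls within two hops plus the other function in its file, `['free_image', 'read_byte', 'read_u32']`. A real system would presumably attach source code, signatures, and summaries to each node; this sketch keeps only names and two relation types.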

Hanxiang Xu

Huazhong University of Science and Technology

Wei Ma

Ting Zhou

Huazhong University of Science and Technology

Yanjie Zhao

Huazhong University of Science and Technology

China

Kai Chen

Huazhong University of Science and Technology

Qiang Hu

The University of Tokyo

Japan

Yang Liu

Nanyang Technological University

Singapore

Haoyu Wang

Huazhong University of Science and Technology

China

Session Program

Thu 1 May
Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30	Industry Challenge Presentations (Industry Challenge Track) at 211. Chair(s): Federica Sarro (University College London), Xin Xia (Huawei)

14:00 15m Talk	CKGFuzzer: LLM-Based Fuzz Driver Generation Enhanced By Code Knowledge Graph [Award Winner]. Industry Challenge Track. Hanxiang Xu (Huazhong University of Science and Technology), Wei Ma, Ting Zhou (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Kai Chen (Huazhong University of Science and Technology), Qiang Hu (The University of Tokyo), Yang Liu (Nanyang Technological University), Haoyu Wang (Huazhong University of Science and Technology)
14:15 15m Talk	ClauseBench: Enhancing Software License Analysis with Clause-Level Benchmarking. Industry Challenge Track. Qiang Ke (Huazhong University of Science and Technology), Xinyi Hou (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology)
14:30 15m Talk	CodeMorph: Mitigating Data Leakage in Large Language Model Assessment. Industry Challenge Track. Hongzhou Rao (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Wenjie Zhu (Huazhong University of Science and Technology), Ling Xiao (Huazhong University of Science and Technology), Meizhen Wang (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology)
14:45 15m Talk	CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems [Security]. Industry Challenge Track. Zhaonan Wu (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Chen Wei (MYbank, Ant Group), Zirui Wan (Huazhong University of Science and Technology), Yue Liu (Monash University), Haoyu Wang (Huazhong University of Science and Technology)
15:00 15m Talk	Exploring Large Language Models for Analyzing Open Source License Conflicts: How Far Are We? Industry Challenge Track. Xing Cui (Institute of Software, Chinese Academy of Sciences), Jingzheng Wu (Institute of Software, The Chinese Academy of Sciences), Xiang Ling (Institute of Software, Chinese Academy of Sciences), Tianyue Luo (Institute of Software, Chinese Academy of Sciences), Mutian Yang (Beijing ZhongKeWeiLan Technology Co., Ltd.), Wenxiang Ou (Institute of Software, Chinese Academy of Sciences)
15:15 15m Talk	OSS-LCAF: Open-Source Software License Conflict Analysis Framework. Industry Challenge Track. Aditya Kahol (TCS Research), Anka Chandrahas Tummepalli (TCS Research), Preethu Rose Anish (TCS Research)