Unlocking Low Frequency Syscalls in Kernel Fuzzing with Dependency-based RAG
Most coverage-guided kernel fuzzers test operating system kernels by synthesizing syscall sequences. However, some syscalls are rarely or never covered within a given fuzzing period (we call these low frequency syscalls, LFS), so the relevant code branches remain unexplored. This is due to the complex dependencies of LFS and the uncertainty of mutation, which make it difficult for fuzzers to generate the corresponding syscall sequences. Since many kernel fuzzers can dynamically learn syscall dependencies from the current corpus through the choice table mechanism, providing comprehensive, high-quality seeds could help fuzzers cover LFS. However, constructing such seeds relies heavily on expert experience to resolve the syscall dependencies.
In this paper, we propose SyzGPT, the first kernel fuzzing framework to automatically generate effective seeds for LFS using a Large Language Model (LLM). We leverage a dependency-based retrieval-augmented generation (DRAG) method to unlock the potential of the LLM and design a series of steps to improve the effectiveness of the generated seeds. First, SyzGPT automatically extracts syscall dependencies from existing documentation via the LLM. Second, SyzGPT retrieves programs from the fuzzing corpus based on these dependencies to construct adaptive context for the LLM. Last, SyzGPT periodically generates and repairs seeds with feedback to enrich the fuzzing corpus for LFS. We also propose a novel set of evaluation metrics for seed generation in the kernel domain. Our evaluation shows that SyzGPT can generate seeds with a high valid rate of 87.84% and can be extended to offline and fine-tuned LLMs. Compared to seven state-of-the-art kernel fuzzers, SyzGPT improves code coverage by 17.73%, LFS coverage by 58.00%, and vulnerability detection by 323.22% on average. In addition, SyzGPT independently discovered 26 unknown kernel bugs (10 of them LFS-related), 11 of which have been confirmed.
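The retrieval step described above can be illustrated with a minimal sketch. This is not SyzGPT's implementation: the dependency map, the toy corpus, the function name `retrieve_context`, and the overlap-based ranking are all assumptions made for illustration, and the syscall names are only examples written in a syzkaller-like style.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: target syscall -> syscalls assumed to
# produce the resources it needs (illustrative values only).
DEPENDENCIES: Dict[str, Set[str]] = {
    "ioctl$KVM_RUN": {
        "openat$kvm",
        "ioctl$KVM_CREATE_VM",
        "ioctl$KVM_CREATE_VCPU",
    },
}

# Hypothetical corpus: each program reduced to its ordered syscall names.
CORPUS: List[List[str]] = [
    ["openat$kvm", "ioctl$KVM_CREATE_VM"],
    ["socket", "bind", "listen"],
    ["openat$kvm", "ioctl$KVM_CREATE_VM", "ioctl$KVM_CREATE_VCPU"],
]


def retrieve_context(
    target: str,
    deps: Dict[str, Set[str]],
    corpus: List[List[str]],
    top_k: int = 2,
) -> List[List[str]]:
    """Return up to top_k corpus programs that share the most syscalls
    with the target and its dependencies (an assumed ranking rule)."""
    wanted = deps.get(target, set()) | {target}
    scored = []
    for index, program in enumerate(corpus):
        overlap = len(wanted & set(program))
        if overlap > 0:
            scored.append((overlap, index, program))
    # Highest overlap first; ties keep the original corpus order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [program for _, _, program in scored[:top_k]]


if __name__ == "__main__":
    for program in retrieve_context("ioctl$KVM_RUN", DEPENDENCIES, CORPUS):
        print(program)
```

In a pipeline of this shape, the retrieved programs would be placed in the LLM prompt as examples of how the target's dependencies are satisfied; how SyzGPT actually ranks and formats its context is not specified in this abstract.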
Thu 26 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
16:00 - 17:15
16:00 | 25m | Talk | Program Feature-based Benchmarking for Fuzz Testing | Research Papers | Miao Miao (The University of Texas at Dallas), Sriteja Kummita (Fraunhofer Institute for Mechatronic Systems Design (Fraunhofer IEM)), Eric Bodden (Heinz Nixdorf Institute at Paderborn University; Fraunhofer IEM), Shiyi Wei (University of Texas at Dallas)
16:25 | 25m | Talk | Unlocking Low Frequency Syscalls in Kernel Fuzzing with Dependency-based RAG | Research Papers | Zhiyu Zhang (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Longxing Li (Institute of Information Engineering, Chinese Academy of Sciences, China), Ruigang Liang (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Kai Chen (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences)
16:50 | 25m | Talk | Structure-Aware, Diagnosis-Guided ECU Firmware Fuzzing | Research Papers | Qicai Chen (Fudan University, China), Kun Hu (School of Computer Science, Fudan University), Sichen Gong (Fudan University, China), Bihuan Chen (Fudan University), kevin kong (Fudan University), Haowen Jiang (Fudan University, China), Bingkun Sun (Fudan University), You Lu (Fudan University), Xin Peng (Fudan University)
Cosmos 3A is the first room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.