WildSync: Automated Fuzzing Harness Synthesis via Wild API Usage Recovery
Fuzzing stands as one of the most practical techniques to test software efficiently. When applying fuzzing to software library APIs, high-quality fuzzing harnesses are essential, enabling fuzzers to execute the APIs with precise sequences and function parameters. Although software developers commonly rely on manual efforts to create fuzzing harnesses, there has been a growing interest in automating this process. Existing works are often constrained in their scalability and effectiveness due to their reliance on compiler-based analysis or runtime execution traces, which require manual setup and configuration. Our investigation of multiple actively fuzzed libraries reveals that a large number of exported API functions externally used by various open-source projects still remain untested by existing harnesses or unit-test files. The lack of testing for these API functions increases the risk of vulnerabilities going undetected, potentially leading to security issues.
To address the lack of coverage affecting existing fuzzing methods, we propose a novel approach to automatically generate fuzzing harnesses by extracting usage patterns of untested functions from real-world scenarios, using techniques based on lightweight Abstract Syntax Tree parsing to extract API usage from external source code. Then, we integrate the usage patterns into existing harnesses to construct new ones covering these untested functions. We have implemented a prototype of this concept named WildSync, enabling the automatic synthesis of fuzzing harnesses for C/C++ libraries on OSS-Fuzz. In our experiments, WildSync successfully produced 469 new harnesses for 24 actively fuzzed libraries on OSS-Fuzz, and also extended to 3 widely used libraries that can be later integrated into OSS-Fuzz. This results in a significant increase in test coverage spanning over 1.3k functions and nearly 20k lines of code, while also identifying 7 previously undetected bugs.
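The first step described above, finding exported functions that no harness covers and locating their call sites in external code, can be sketched roughly as follows. This is only an illustration under stated assumptions: it substitutes a simple regex scan for the lightweight AST parsing the abstract mentions, and every name in it (`find_untested_usages`, the `img_*` functions, the sample snippet) is hypothetical rather than taken from WildSync.

```python
import re

# Matches an identifier immediately followed by "(", i.e. a likely C call site.
# A regex is a crude stand-in for AST parsing and is used here only for brevity.
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def find_untested_usages(exported, tested, external_source):
    """Map each exported-but-untested function to the lines where it is called.

    exported        -- names the library exports (hypothetical input)
    tested          -- names already reached by existing harnesses (hypothetical input)
    external_source -- text of a third-party C/C++ file that uses the library
    """
    untested = set(exported) - set(tested)
    usages = {}
    for lineno, line in enumerate(external_source.splitlines(), start=1):
        for name in CALL_RE.findall(line):
            if name in untested:
                usages.setdefault(name, []).append(lineno)
    return usages


# Hypothetical library with three exported functions, two already fuzzed.
exported = {"img_open", "img_resize", "img_close"}
tested = {"img_open", "img_close"}
external = """\
img_t *im = img_open(path);
img_resize(im, 64, 64);
img_close(im);
"""

print(find_untested_usages(exported, tested, external))
```

In this toy input only `img_resize` is both untested and used externally, so it is the only candidate whose surrounding call sequence would be worth lifting into a new harness.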
Fri 27 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
16:00 - 17:30 | Fuzzing and Search-Based Testing | Research Papers / Tool Demonstrations at Cosmos 3C | Chair(s): Thuan Pham, University of Melbourne
16:00 | 25m | Talk | ZTaint-Havoc: From Havoc Mode to Zero-Execution Fuzzing-Driven Taint Inference | Research Papers | Yuchong Xie, Hong Kong University of Science and Technology; Wenhui Zhang, Hunan University, Changsha, China; Dongdong She, HKUST (The Hong Kong University of Science and Technology)
16:25 | 25m | Talk | WildSync: Automated Fuzzing Harness Synthesis via Wild API Usage Recovery | Research Papers
16:50 | 25m | Talk | FANDANGO: Evolving Language-Based Testing | Research Papers | José Antonio Zamudio Amaya, CISPA Helmholtz Center for Information Security; Marius Smytzek, CISPA Helmholtz Center for Information Security; Andreas Zeller, CISPA Helmholtz Center for Information Security
17:15 | 15m | Demonstration | XAVIER: Grammar-Based Testing for XML Injection Attacks | Tool Demonstrations | Paul Kalbitzer; José Antonio Zamudio Amaya, CISPA Helmholtz Center for Information Security; Andreas Zeller, CISPA Helmholtz Center for Information Security
Cosmos 3C is the third room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.