No Harness, No Problem: Oracle-guided Harnessing for Auto-generating C API Fuzzing Harnesses (ICSE 2025 - Research Track)

Who

Gabriel Sherman, Stefan Nagy

Track

ICSE 2025 Research Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Wed 30 Apr 2025 11:30 - 11:45 at 205 - Testing and QA 1 Chair(s): Jonathan Bell

Abstract

Library APIs are used by virtually every modern application and system, making them among today’s most security-critical software. In recent years, library bug-finding efforts have overwhelmingly adopted the powerful testing strategy of coverage-guided fuzzing. At its core, API fuzzing operates on harnesses: wrapper programs that initialize an API before feeding random inputs to its functions. Successful fuzzing demands correct and thorough harnesses, making manual harnessing challenging without sufficient domain expertise. To overcome this, recent strategies propose “learning” libraries’ intended usage to automatically generate their fuzzing harnesses. Yet, despite their high code coverage, resulting harnesses frequently miss key API semantics—bringing with them invalid, unrealistic, or otherwise-impossible data and call sequences—derailing fuzzing with false-positive crashes. Thus, without a precise, semantically-correct harnessing, many critical APIs will remain beyond fuzzing’s reach—leaving their hidden vulnerabilities ripe for attackers.

This paper introduces Oracle-guided Harnessing: a technique for fully-automatic, semantics-aware API fuzzing harness synthesis. At a high level, Oracle-guided Harnessing mimics the trial-and-error process of manual harness creation—yet automates it via fuzzing. Specifically, we leverage information from API headers to mutationally stitch-together candidate harnesses; and evaluate their validity via a set of Correctness Oracles: compilation, execution, and changes in coverage. By keeping— and further mutating—only correct candidates, our approach produces a diverse set of semantically-correct harnesses for complex, real-world libraries in as little as one hour.

We integrate Oracle-guided Harnessing as a prototype, OGHARN; and evaluate it alongside today’s leading fully-automatic harnessing approach, Hopper, and a plethora of developer-written harnesses from OSS-Fuzz. Across 20 real-world APIs, OGHARN outperforms developer-written harnesses by a median 14% code coverage, while uncovering 31 and 30 more vulnerabilities than both Hopper and developer-written harnesses, respectively—with zero false-positive crashes. Of the 41 new vulnerabilities found by OGHARN, all 41 are confirmed by developers—40 of which are since fixed—with many found in APIs that, until now, lacked harnesses whatsoever.

Gabriel Sherman

University of Utah

Stefan Nagy