Seed Selection for Successful Fuzzing
Sat 17 Jul 2021 10:10 - 10:30 at ISSTA 2 - Session 28 (time band 3) Fuzzing and Runtime Analysis Chair(s): Michaël Marcozzi
Mutation-based greybox fuzzing, unquestionably the most
widely used fuzzing technique, relies on a set of non-crashing seed
inputs (a corpus) to bootstrap the bug-finding process. When
evaluating a fuzzer, common approaches for constructing this corpus
include: (i) using an empty file; (ii) using a single seed representative
of the target's input format; or (iii) collecting a large number of seeds
(e.g., by crawling the Internet). Little thought is given to how this
seed choice affects the fuzzing process, and there is no consensus on which
approach is best (or even if a best approach exists).
To address this gap in knowledge, we systematically investigate and
evaluate how seed selection affects a fuzzer's ability to find
bugs in real-world software. This includes a systematic review
of seed selection practices used in both evaluation and deployment
contexts, and a large-scale empirical evaluation (over 33 CPU-years)
of six seed selection approaches. These six seed selection approaches
include three corpus minimization techniques (which select the
smallest subset of seeds that trigger the same range of instrumentation
data points as a full corpus).
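Selecting the smallest subset of seeds that preserves the full corpus's coverage is an instance of the set-cover problem. The following is a minimal sketch of that idea, assuming each seed's coverage has already been collected as a set of instrumentation point IDs. The `coverage` mapping, the file names, and the `greedy_minimize` function are hypothetical illustrations, not taken from the paper or from any fuzzing tool. It uses a greedy heuristic, which preserves coverage but only approximates the smallest subset.

```python
from typing import Dict, Hashable, List, Set


def greedy_minimize(coverage: Dict[str, Set[Hashable]]) -> List[str]:
    """Greedily pick seeds until the selection hits every instrumentation
    point that the full corpus hits. Approximate, not guaranteed minimal."""
    # Every point reached by at least one seed in the full corpus.
    uncovered = set().union(*coverage.values())
    selected: List[str] = []
    while uncovered:
        # Take the seed that adds the most still-uncovered points;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda s: len(coverage[s] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical corpus: seed name -> IDs of instrumentation points it triggers.
corpus = {
    "a.png": {1, 2, 3},
    "b.png": {2, 3},
    "c.png": {3, 4},
    "d.png": {4},
}
minimized = greedy_minimize(corpus)
print(minimized)
```

Here `b.png` and `d.png` are dropped because every point they trigger is already triggered by a selected seed, so the minimized corpus keeps the same coverage with fewer files.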
Our results demonstrate that fuzzing outcomes vary significantly
depending on the initial seeds used to bootstrap the fuzzer, with
minimized corpora outperforming singleton, empty, and large (in the
order of thousands of files) seed sets. Consequently, we encourage
seed selection to be foremost in mind when evaluating/deploying
fuzzers, and recommend that (a) seed choice be carefully considered
and explicitly documented, and (b) fuzzers never be evaluated with only
a single seed.
Slides: ISSTA 2021 presentation.pdf (1.8 MiB)
Thu 15 Jul (displayed time zone: Brussels, Copenhagen, Madrid, Paris)
00:20 - 01:20 | Session 6 (time band 2): Fuzzing | Technical Papers at ISSTA 2 | Chair(s): Lingming Zhang (University of Illinois at Urbana-Champaign)
00:20 | 20m Talk | Seed Selection for Successful Fuzzing | Technical Papers | Adrian Herrera (Australian National University; DST), Hendra Gunadi (Australian National University), Shane Magrath (DST), Michael Norrish (CSIRO’s Data61; Australian National University), Mathias Payer (EPFL), Tony Hosking (Australian National University; CSIRO’s Data61)
00:40 | 20m Talk | Gramatron: Effective Grammar-Aware Fuzzing | Technical Papers
01:00 | 20m Talk | QFuzz: Quantitative Fuzzing for Side Channels | Technical Papers
Sat 17 Jul (displayed time zone: Brussels, Copenhagen, Madrid, Paris)
09:30 - 10:50 | Session 28 (time band 3): Fuzzing and Runtime Analysis | Technical Papers at ISSTA 2 | Chair(s): Michaël Marcozzi (Université Paris-Saclay, CEA, List)
09:30 | 20m Talk | Runtime Detection of Memory Errors with Smart Status | Technical Papers | Zhe Chen (Nanjing University of Aeronautics and Astronautics), Chong Wang (Nanjing University of Aeronautics and Astronautics), Junqi Yan (Nanjing University of Aeronautics and Astronautics), Yulei Sui (University of Technology Sydney), Jingling Xue (UNSW)
09:50 | 20m Talk | UAFSan: An Object-Identifier-Based Dynamic Approach for Detecting Use-After-Free Vulnerabilities | Technical Papers | Binfa Gui (Nanjing University of Science and Technology), Wei Song (Nanjing University of Science and Technology), Jeff Huang (Texas A&M University)
10:10 | 20m Talk | Seed Selection for Successful Fuzzing | Technical Papers | Adrian Herrera (Australian National University; DST), Hendra Gunadi (Australian National University), Shane Magrath (DST), Michael Norrish (CSIRO’s Data61; Australian National University), Mathias Payer (EPFL), Tony Hosking (Australian National University; CSIRO’s Data61)
10:30 | 20m Talk | QFuzz: Quantitative Fuzzing for Side Channels | Technical Papers