SSBSE 2025
Sun 16 Nov 2025 Seoul, South Korea
co-located with ASE 2025

Pathological inputs can trigger worst-case algorithmic behavior, leading to excessive resource consumption and revealing performance bottlenecks. Generating such inputs is difficult because they follow highly specific patterns that random testing or traditional fuzzing rarely uncovers. Existing fuzzing techniques either rely on domain-specific knowledge, which limits generality, or mutate binary input representations, which yields predominantly invalid inputs.

We present PathoGen, a feedback-driven fuzzing framework that integrates Large Language Models (LLMs) into evolutionary search to efficiently discover pathological inputs. By leveraging LLMs for candidate generation, PathoGen produces high-quality inputs without requiring domain-specific knowledge such as handcrafted grammars. An adaptive prompt template mechanism ensures robustness during input generation, while per-input-size convergence tracking directs computational resources toward the input sizes most likely to expose worst-case behavior.
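To make the search loop concrete, below is a minimal sketch of feedback-driven evolutionary fuzzing with a run per input size. All names and details (llm_propose, fitness, search) are illustrative assumptions rather than PathoGen's actual API: llm_propose stands in for the LLM-backed candidate generation (stubbed with random mutations so the sketch runs offline), the feedback signal is wall-clock cost, and the target is a toy insertion sort whose worst case is quadratic.

```python
import random
import time

def target(xs):
    # Example target under test: insertion sort, whose worst case
    # (a reverse-sorted input) takes quadratic time.
    a = list(xs)
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a

def llm_propose(parent, size):
    # Hypothetical stand-in for the LLM call: given a parent input and a
    # prompt template, the model would return candidates of the requested
    # size. Here we use random swaps so the sketch runs offline.
    child = list(parent)[:size]
    while len(child) < size:
        child.append(random.randint(0, size))
    for _ in range(max(1, size // 10)):
        i, j = random.randrange(size), random.randrange(size)
        child[i], child[j] = child[j], child[i]
    return child

def fitness(candidate):
    # Feedback signal: measured cost of running the target on the input.
    start = time.perf_counter()
    target(candidate)
    return time.perf_counter() - start

def search(size, generations=30, population=8):
    # One evolutionary run at a fixed input size; the framework would
    # track convergence per size and reallocate budget accordingly.
    best, best_cost = list(range(size)), 0.0
    for _ in range(generations):
        for c in (llm_propose(best, size) for _ in range(population)):
            cost = fitness(c)
            if cost > best_cost:
                best, best_cost = c, cost
    return best, best_cost

if __name__ == "__main__":
    for n in (200, 400, 800):
        _, cost = search(n)
        print(f"size={n}: worst observed cost {cost:.4f}s")
```

Running the sketch at increasing sizes shows how the growth of the worst observed cost can be compared against a theoretical complexity bound.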

An evaluation on textbook algorithms and real-world applications shows that PathoGen uncovers inputs matching the expected theoretical complexity bounds, induces significant slowdowns in practical systems, and outperforms existing domain-independent fuzzers in efficiency and input validity, while performing comparably to domain-specific fuzzers within their specialized domains.