Fuzzing is a highly effective method for uncovering software vulnerabilities, but analyzing the resulting data typically requires substantial manual effort. This is amplified by the fact that fuzzing campaigns often find a large number of crashing inputs, many of which share the same underlying bug. Crash deduplication is the task of finding such duplicate crashing inputs and thereby reducing the data that needs to be examined. Many existing deduplication approaches rely on comparing stack traces or other information that is collected when a program crashes. Although various metrics for measuring the similarity of such pieces of information have been proposed, many do not yield satisfactory deduplication results. In this work, we present GPTrace, a deduplication workflow that leverages a large language model to evaluate the similarity of various data sources associated with crashes by computing embedding vectors and supplying those as input to a clustering algorithm. We evaluate our approach on over 300 000 crashing inputs belonging to 50 ground truth labels from 14 different targets. The deduplication results produced by GPTrace show a noticeable improvement over hand-crafted stack trace comparison methods and even more complex state-of-the-art approaches that are less flexible.
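The embed-then-cluster idea in the abstract can be illustrated with a minimal sketch. This is not GPTrace's actual implementation: `toy_embed` is a hypothetical stand-in (a hashed bag of tokens) for a real LLM embedding model, and the greedy threshold clustering, including the `threshold` value, is an assumed placeholder for whatever clustering algorithm the authors use.

```python
import hashlib
import math
from typing import List


def toy_embed(text: str, dim: int = 256) -> List[float]:
    """Hypothetical stand-in for an LLM embedding: hashed bag of tokens, L2-normalised."""
    vec = [0.0] * dim
    for token in text.split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; inputs are assumed to be unit-length already."""
    return sum(x * y for x, y in zip(a, b))


def cluster(vectors: List[List[float]], threshold: float = 0.8) -> List[int]:
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    representatives: List[List[float]] = []
    labels: List[int] = []
    for vec in vectors:
        for idx, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(idx)
                break
        else:
            # No existing cluster matched, so this crash starts a new one.
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels


def deduplicate(traces: List[str], threshold: float = 0.8) -> List[int]:
    """Map each crash trace (as text) to a cluster label."""
    return cluster([toy_embed(t) for t in traces], threshold)


if __name__ == "__main__":
    # Made-up stack traces, one frame name per token.
    traces = [
        "png_read_row png_read_image main",
        "png_read_row png_read_image main",
        "xml_parse_node xml_parse_doc run",
    ]
    print(deduplicate(traces))
```

Crashes that receive the same label would be treated as duplicates of one bug. In a real setup, the embedding function would call an embedding model on the stack trace or other crash data, and the clustering step would likely be a library algorithm rather than this greedy pass.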
Thu 16 Apr. Displayed time zone: Brasilia, Distrito Federal, Brazil
11:00 - 12:30 | Testing and Analysis 8 | Research Track at Oceania IX | Chair(s): Luca Di Grazia (University of St. Gallen)
11:00 | 15m Talk | RusyFuzz: Unhandled Exception Guided Fuzzing for Rust OS Kernel | Research Track | Yuwei Liu (Ant Group), Yanhao Wang (Independent Researcher), Minghua Wang (Ant Group), Lin Huang (Ant Group), Purui Su (Institute of Software/CAS China), Tao Wei (Ant Group)
11:15 | 15m Talk | VDBFuzz: Understanding and Detecting Crash Bugs in Vector Database Management Systems | Research Track | Shenao Wang (Huazhong University of Science and Technology), Zhao Liu (360 AI Security Lab), Yanjie Zhao (Huazhong University of Science and Technology), Quanchen Zou (360 AI Security Lab), Haoyu Wang (Huazhong University of Science and Technology)
11:30 | 15m Talk | GPTrace: Effective Crash Deduplication Using LLM Embeddings | Research Track | Patrick Herter (Fraunhofer AISEC), Vincent Ahlrichs (Fraunhofer AISEC), Ridvan Açilan (Technical University of Munich), Julian Horsch (Fraunhofer AISEC)
11:45 | 15m Talk | Is My RPC Response Reliable? Detecting RPC Bugs in Blockchain Client under Context | Research Track | Zhijie Zhong (School of Software Engineering, Sun Yat-sen University), Yuhong Nan (Sun Yat-sen University), Mingxi Ye (Sun Yat-sen University), Qing Xue (Sun Yat-sen University), Jiashui Wang (Zhejiang University), Long Liu, Xinlei Ying, Zibin Zheng (Sun Yat-sen University)
12:00 | 15m Talk | EchoFuzz: Empowering Smart Contract Fuzzing with Large Language Models | Research Track | Juanen Li (Tsinghua University), Peng Qian (Zhejiang University), Guanyan Li (University of Oxford), Rui Wang (Beijing Normal University), Peixin Wang (East China Normal University), Zhiqing Tang (Beijing Normal University), Fuchen Ma (Tsinghua University), Yuanliang Chen (Tsinghua University), Lun Zhang (GoPlus Security)
12:15 | 15m Talk | StorFuzz: Using Data Diversity to Overcome Fuzzing Plateaus | Research Track | Leon Weiß (Ruhr University Bochum), Tobias Holl (Ruhr University Bochum), Kevin Borgolte (Ruhr University Bochum)