Validating Network Protocol Parsers with Traceable RFC Document Interpretation (ISSTA 2025 - Research Papers)

Who

Mingwei Zheng, Danning Xie, Qingkai Shi, Chengpeng Wang, Xiangyu Zhang

Track

ISSTA 2025 Research Papers

When

Wed 25 Jun 2025 14:25 - 14:50 at Cosmos 3A - LLM-based Testing 1 Chair(s): Qingkai Shi

Abstract

Validating the correctness of network protocol implementations is highly challenging due to the oracle and traceability problems. The former concerns determining when a protocol implementation should be considered buggy, especially when the bugs do not cause any observable symptoms. The latter concerns helping developers understand how an implementation violates the protocol specification, thereby facilitating bug fixes. Unlike existing works, which rarely take both problems into account, this work considers both and provides an effective solution using recent advances in large language models (LLMs). Our key observation is that network protocols are often released with structured specification documents, a.k.a. RFC documents, which can be systematically translated into formal protocol message specifications via an LLM. Such specifications, which may contain errors due to LLM hallucination, are used as a quasi-oracle to validate protocol parsers. The validation results in turn gradually refine the oracle. Since the oracle is derived from the document, any bug we find in a protocol implementation can be traced back to the document, thus addressing the traceability problem. We have extensively evaluated our approach using nine network protocols and their implementations written in C, Python, and Go. The results show that our approach outperforms the state of the art and has detected 69 bugs, 35 of which have been confirmed. The project also demonstrates the potential for fully automating software validation based on natural language specifications, a process previously considered predominantly manual due to the need to understand specification documents and derive expected outputs for test inputs.
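
The quasi-oracle idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' tool or specification format: the toy message layout, the `Rule` class, the clause labels such as "toy-spec 2.3", and the deliberately buggy `parser_under_test` are all hypothetical names invented for illustration. Each rule records the document clause it was derived from, so a disagreement between the rules and the parser can be reported together with that clause.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Rule:
    clause: str                     # hypothetical document clause the rule was derived from
    description: str
    holds: Callable[[bytes], bool]  # True when the message satisfies the rule


# Toy message layout (an assumption, not a real protocol):
# 1-byte type, 1-byte length, then exactly `length` payload bytes.
# Rules are ordered so that later rules may rely on earlier ones having passed.
TOY_SPEC: List[Rule] = [
    Rule("toy-spec 2.1", "header is at least 2 bytes", lambda m: len(m) >= 2),
    Rule("toy-spec 2.2", "type is 1 or 2", lambda m: m[0] in (1, 2)),
    Rule("toy-spec 2.3", "length field equals payload size",
         lambda m: m[1] == len(m) - 2),
]


def first_violation(msg: bytes) -> Optional[Rule]:
    """Return the first rule the message breaks, or None if it satisfies all rules."""
    for rule in TOY_SPEC:
        if not rule.holds(msg):
            return rule
    return None


def parser_under_test(msg: bytes) -> bool:
    """Hypothetical parser with a seeded bug: it never checks the length field."""
    return len(msg) >= 2 and msg[0] in (1, 2)


def check(msg: bytes) -> Optional[str]:
    """Compare the parser's verdict with the quasi-oracle; report any disagreement."""
    violated = first_violation(msg)
    spec_ok = violated is None
    impl_ok = parser_under_test(msg)
    if spec_ok == impl_ok:
        return None
    if impl_ok:
        return (f"parser accepts message violating {violated.clause}: "
                f"{violated.description}")
    return "parser rejects a message the spec accepts"


if __name__ == "__main__":
    print(check(b"\x01\x02ab"))  # well-formed message, both sides agree
    print(check(b"\x01\x05ab"))  # length field says 5, payload has 2 bytes
```

A reported disagreement is only a candidate bug. Because the rules themselves may be wrong, as the abstract notes for LLM-derived specifications, a disagreement can equally indicate a faulty rule that should be corrected before the next round of validation.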

DOI

https://doi.org/10.1145/3728955

Mingwei Zheng

Purdue University

United States

Danning Xie

Purdue University

Qingkai Shi

Nanjing University

China

Chengpeng Wang

Purdue University

United States

Xiangyu Zhang

Purdue University

United States

Session Program

Wed 25 Jun
Displayed time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:30	LLM-based Testing 1 (Research Papers / Tool Demonstrations) at Cosmos 3A. Chair(s): Qingkai Shi (Nanjing University)

14:00 25m Talk	A Large-scale Empirical Study on Fine-tuning Large Language Models for Unit Testing (Research Papers). Ye Shang (Nanjing University), Quanjun Zhang (School of Computer Science and Engineering, Nanjing University of Science and Technology), Chunrong Fang (Nanjing University), Siqi Gu (Nanjing University), Jianyi Zhou (Huawei Cloud Computing Technologies Co., Ltd.), Zhenyu Chen (Nanjing University)
14:25 25m Talk	Validating Network Protocol Parsers with Traceable RFC Document Interpretation (Research Papers). Mingwei Zheng (Purdue University), Danning Xie (Purdue University), Qingkai Shi (Nanjing University), Chengpeng Wang (Purdue University), Xiangyu Zhang (Purdue University)
14:50 25m Talk	Tratto: A Neuro-Symbolic Approach to Deriving Axiomatic Test Oracles (Research Papers). Davide Molinelli (USI Lugano; Schaffhausen Institute of Technology), Alberto Martin-Lopez (Software Institute - USI, Lugano), Elliott Zackrone (University of Washington), Beyza Eken (Sakarya University), Michael D. Ernst (University of Washington), Mauro Pezze (Università della Svizzera italiana (USI) and Università degli Studi di Milano Bicocca and CIT Constructor Institute of Technology)
15:15 15m Demonstration	Kitten: A Simple Yet Effective Baseline for Evaluating LLM-Based Compiler Testing Techniques (Tool Demonstrations). Yuanmin Xie (Tsinghua University), Zhenyang Xu (University of Waterloo), Yongqiang Tian, Min Zhou, Xintong Zhou (University of Waterloo), Chengnian Sun (University of Waterloo)

Information for Participants

Wed 25 Jun 2025 14:00 - 15:30 at Cosmos 3A - LLM-based Testing 1 Chair(s): Qingkai Shi

Info for room Cosmos 3A:

Cosmos 3A is the first room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.