3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers (ICSE 2025 - Research Track)

Who

Sarah Fakhoury, Markus Kuppe, Shuvendu K. Lahiri, Tahina Ramananandro, Nikhil Swamy

Track

ICSE 2025 Research Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Fri 2 May 2025 14:00 - 14:15 at 212 - AI for Analysis 5 Chair(s): Tien N. Nguyen

Abstract

Improper parsing of attacker-controlled input is a leading source of software security vulnerabilities, especially when programmers transcribe informal format descriptions into efficient parsing logic in low-level, memory unsafe languages. Several researchers have proposed formal specification languages for data formats from which efficient code can be extracted. However, distilling informal requirements into formal specifications is challenging and, despite their benefits, new, formal languages are hard for people to learn and use.

In this work, we present 3DGen, a framework that makes use of AI agents to transform mixed informal input, including natural language documents and example inputs into format specifications in a language called 3D. To support humans in understanding and trusting the generated specifications, 3DGen uses symbolic methods to also synthesize test inputs that can be validated against an external oracle. Symbolic test generation also helps in distinguishing multiple plausible solutions. Through a process of repeated refinement, 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C.

We have evaluated 3DGen on 20 Internet standard formats, demonstrating the potential for AI-agents to produce formally verified C code at a non-trivial scale. A key enabler is the use of a domain-specific language to limit AI outputs to a class for which automated, symbolic analysis is tractable.

Link to Preprint

https://arxiv.org/pdf/2404.10362

Sarah Fakhoury

Microsoft Research

United States

Markus Kuppe

Microsoft Research

Shuvendu K. Lahiri

Microsoft Research

United States

Tahina Ramananandro

Microsoft Research

United States

Nikhil Swamy

Microsoft Research

United States

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Fri 2 May
Displayed time zone: Eastern Time (US & Canada) change

14:00 - 15:30	AI for Analysis 5Research Track / New Ideas and Emerging Results (NIER) at 212 Chair(s): Tien N. Nguyen University of Texas at Dallas

14:00 15m Talk		3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers Research Track Sarah Fakhoury Microsoft Research, Markus Kuppe Microsoft Research, Shuvendu K. Lahiri Microsoft Research, Tahina Ramananandro Microsoft Research, Nikhil Swamy Microsoft Research Pre-print
14:15 15m Talk		Aligning the Objective of LLM-based Program Repair Research Track Junjielong Xu The Chinese University of Hong Kong, Shenzhen, Ying Fu Chongqing University, Shin Hwei Tan Concordia University, Pinjia He Chinese University of Hong Kong, Shenzhen Pre-print
14:30 15m Talk		Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models Research Track Aidan Z.H. Yang Carnegie Mellon University, Sophia Kolak Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University, Ruben Martins Carnegie Mellon University, Claire Le Goues Carnegie Mellon University
14:45 15m Talk		The Fact Selection Problem in LLM-Based Program Repair Research Track Nikhil Parasaram Uber Amsterdam, Huijie Yan University College London, Boyu Yang University College London, Zineb Flahy University College London, Abriele Qudsi University College London, Damian Ziaber University College London, Earl T. Barr University College London, Sergey Mechtaev Peking University
15:00 15m Talk		Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models Research Track Zhijie Wang University of Alberta, Zijie Zhou University of Illinois Urbana-Champaign, Da Song University of Alberta, Yuheng Huang University of Alberta, Canada, Shengmai Chen Purdue University, Lei Ma The University of Tokyo & University of Alberta, Tianyi Zhang Purdue University Pre-print
15:15 15m Talk		Beyond Syntax: How Do LLMs Understand Code? New Ideas and Emerging Results (NIER) Marc North Durham University, Amir Atapour-Abarghouei Durham University, Nelly Bencomo Durham University