Understanding Formal Reasoning Failures in LLMs as Abstract Interpreters
This program is tentative and subject to change.
Large language models (LLMs) are increasingly used for program verification, and yet little is known about \emph{how} they reason about program semantics during this process. In this work, we focus on abstract interpretation based-reasoning for invariant generation and introduce two novel prompting strategies that aim to elicit such reasoning from LLMs. We evaluate these strategies across several state-of-the-art LLMs on 22 programs from the SV-COMP benchmark suite, widely used in software verification, and we analyze both the soundness of the generated invariants and the key thematic patterns in the models’ reasoning errors. This work aims to highlight new research opportunities at the intersection of large language models and program verification, both for applying LLMs to verification tasks and for advancing their reasoning capabilities in this application.
This program is tentative and subject to change.
Wed 15 OctDisplayed time zone: Perth change
16:00 - 17:40 | LLMs for Program Analysis and Verification IILMPL at Orchid East Chair(s): Zhuo Zhang Columbia University | ||
16:00 15mTalk | Hallucination-Resilient LLM-Driven Sound and Tunable Static Analysis LMPL | ||
16:15 20mTalk | Toward Repository-Level Program Verification with Large Language Models LMPL DOI | ||
16:35 15mTalk | Preguss: It Analyzes, It Specifies, It Verifies LMPL Zhongyi Wang Zhejiang University, China, Tengjie Lin Zhejiang University, Mingshuai Chen Zhejiang University, Mingqi Yang Zhejiang University, Haokun Li Peking University, Xiao Yi The Chinese University of Hong Kong, Shengchao Qin Xidian University, Jianwei Yin Zhejiang University | ||
16:50 20mTalk | A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants LMPL Barış Bayazıt University of Toronto, Yao Li Portland State University, Xujie Si University of Toronto Pre-print | ||
17:10 20mTalk | Understanding Formal Reasoning Failures in LLMs as Abstract Interpreters LMPL Jacqueline Mitchell University of Southern California, Brian Hyeongseok Kim University of Southern California, Chenyu Zhou University of Southern California, Chao Wang University of Southern California |