Wed 30 Apr 2025 11:30 - 11:45 at 212 - AI for Analysis 1 Chair(s): Denys Poshyvanyk

Formal verification is a promising method for producing reliable software, but the difficulty of manually writing verification proofs severely limits its utility in practice. Recent methods have automated some proof synthesis by guiding a search through the proof space using a theorem prover. Unfortunately, the theorem prover provides only the crudest estimate of progress, resulting in effectively undirected search. To address this problem, we create QEDCartographer, an automated proof-synthesis tool that combines supervised and reinforcement learning to more effectively explore the proof space. QEDCartographer incorporates the proofs’ branching structure, enabling reward-free search and overcoming the sparse reward problem inherent to formal verification. We evaluate QEDCartographer using the CoqGym benchmark of 68.5K theorems from 124 open-source Coq projects. QEDCartographer fully automatically proves 21.4% of the test-set theorems. Previous search-based proof-synthesis tools Tok, Tac, ASTactic, Passport, and Proverbot9001, which rely only on supervised learning, prove 9.6%, 9.8%, 10.9%, 12.5%, and 19.8%, respectively. Diva, which combines 62 tools, proves 19.2%. Compared with the most effective prior tool, Proverbot9001, QEDCartographer produces 26% shorter proofs 27% faster, on average over the theorems both tools prove. Together, QEDCartographer and the non-learning-based CoqHammer prove 31.8% of the theorems, while CoqHammer alone proves 26.6%. Our work demonstrates that reinforcement learning is a fruitful research direction for improving proof-synthesis tools’ search mechanisms.
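
As a rough illustration of what "guiding a search through the proof space" can mean, the following minimal Python sketch runs a best-first search in which a state-value estimate decides which proof state to expand next. It is not QEDCartographer's implementation: the function names, the toy integer "proof states", the two toy tactics, and the estimate (which stands in for a learned model) are all assumptions made for this example.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

State = Hashable
Tactic = str


def best_first_proof_search(
    start: State,
    expand: Callable[[State], Iterable[Tuple[Tactic, State]]],
    is_proved: Callable[[State], bool],
    estimate: Callable[[State], float],
    max_expansions: int = 1000,
) -> Optional[List[Tactic]]:
    """Return a tactic script reaching a proved state, or None if none is found.

    `estimate` is a hypothetical stand-in for a learned estimate of how much
    work remains from a state; lower values are expanded first.
    """
    counter = 0  # tie-breaker so the heap never has to compare states
    frontier = [(estimate(start), counter, start, [])]
    seen = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, script = heapq.heappop(frontier)
        if is_proved(state):
            return script
        expansions += 1
        for tactic, nxt in expand(state):
            if nxt in seen:
                continue  # skip states that were already queued
            seen.add(nxt)
            counter += 1
            heapq.heappush(frontier, (estimate(nxt), counter, nxt, script + [tactic]))
    return None


def toy_expand(n: int) -> List[Tuple[Tactic, int]]:
    """Toy 'tactics' over an integer count of remaining obligations (assumption)."""
    steps = []
    if n > 0 and n % 2 == 0:
        steps.append(("halve", n // 2))
    if n > 0:
        steps.append(("dec", n - 1))
    return steps


script = best_first_proof_search(10, toy_expand, lambda n: n == 0, float)
print(script)
```

In this toy setting the estimate is simply the number of remaining obligations; in a learned system that role would be played by a trained model, and the quality of the estimate determines how directed the search is.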

Wed 30 Apr

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
AI for Analysis 1 (Research Track) at 212
Chair(s): Denys Poshyvanyk William & Mary
11:00
15m
Talk
A Multiple Representation Transformer with Optimized Abstract Syntax Tree for Efficient Code Clone Detection
Research Track
TianChen Yu School of Software Engineering, South China University of Technology, Li Yuan School of Software Engineering, South China University of Technology, Guangzhou, China, Liannan Lin School of Software Engineering, South China University of Technology, Hongkui He School of Software Engineering, South China University of Technology
11:15
15m
Talk
Can an LLM find its way around a Spreadsheet?
Research Track
Cho-Ting Lee Virginia Tech, Andrew Neeser Virginia Tech, Shengzhe Xu Virginia Tech, Jay Katyan Virginia Tech, Patrick Cross Virginia Tech, Sharanya Pathakota Virginia Tech, Marigold Norman World Forest ID, John C. Simeone Simeone Consulting, LLC, Jaganmohan Chandrasekaran Virginia Tech, Naren Ramakrishnan Virginia Tech
11:30
15m
Talk
QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning (Artifact-Available)
Research Track
Alex Sanchez-Stern University of Massachusetts at Amherst, Abhishek Varghese University of Massachusetts, Zhanna Kaufman University of Massachusetts, Shizhuo Zhang University of Illinois Urbana-Champaign, Talia Lily Ringer University of Illinois Urbana-Champaign, Yuriy Brun University of Massachusetts
11:45
15m
Talk
TIGER: A Generating-Then-Ranking Framework for Practical Python Type Inference
Research Track
Chong Wang Nanyang Technological University, Jian Zhang Nanyang Technological University, Yiling Lou Fudan University, Mingwei Liu Fudan University, Weisong Sun Nanyang Technological University, Yang Liu Nanyang Technological University, Xin Peng Fudan University
12:00
15m
Talk
ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation
Research Track
Xue Jiang, Yihong Dong Peking University, Yongding Tao University of Electronic Science and Technology of China, Huanyu Liu Xidian University, Zhi Jin Peking University, Ge Li Peking University
12:15
15m
Talk
Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification (Artifact-Functional, Artifact-Available, Artifact-Reusable, Award Winner)
Research Track
Kyle Thompson University of California, San Diego, Nuno Saavedra INESC-ID and IST, University of Lisbon, Pedro Carrott Imperial College London, Kevin Fisher University of California San Diego, Alex Sanchez-Stern University of Massachusetts, Yuriy Brun University of Massachusetts, João F. Ferreira INESC-ID and IST, University of Lisbon, Sorin Lerner University of California at San Diego, Emily First University of California, San Diego