FORGE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil
co-located with ICSE 2026

Formal specification generation has recently drawn attention in software engineering as a way to improve program correctness without requiring manual annotations. Large Language Models (LLMs) have shown promise in this area, but early results reveal several limitations. Generated specifications often fail verification due to syntax errors, logical inaccuracies, or incomplete reasoning, especially in programs with loops or branching logic. Techniques like SpecGen and FormalBench attempt to address this through prompting and benchmarking, but they typically rely on static prompts and do not offer mechanisms for recovering from failure or adapting to different program structures. In this paper, we present AutoReSpec, a collaborative framework that combines open-source and closed-source LLMs for verifiable specification generation. AutoReSpec dynamically chooses an LLM pair and prompt configuration based on the structure of the input program. If the primary LLM fails to produce a valid output, a collaborative model is invoked, using validator feedback to refine and correct the specification. This two-stage design enables both speed and robustness. We evaluate AutoReSpec on a new benchmark of 72 real-world and synthetic Java programs. Our results show that it achieves 67 passes out of 72, outperforming SpecGen and FormalBench in both Success Probability and Completeness. In our experiments, AutoReSpec achieves a 58.2% success probability and a 69.2% completeness score while cutting evaluation time by 26.89% on average compared to prior methods. Together, these results demonstrate that AutoReSpec offers a scalable, efficient, and reliable approach to LLM-based formal specification generation.
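The two-stage flow described in the abstract (a primary model first, then a collaborative model driven by validator feedback) might look roughly like the sketch below. This is not the authors' implementation: the names `Verdict`, `generate_spec`, and `max_repairs`, the callable signatures, and the toy stand-ins for the models and the validator are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Outcome of one validation run (hypothetical type)."""
    ok: bool
    feedback: str = ""


# Hypothetical callable shapes; the paper's real interfaces are not given here.
Primary = Callable[[str], str]                 # program -> specification
Collaborator = Callable[[str, str, str], str]  # program, spec, feedback -> specification
Validator = Callable[[str, str], Verdict]      # program, spec -> verdict


def generate_spec(
    program: str,
    primary: Primary,
    collaborator: Collaborator,
    validate: Validator,
    max_repairs: int = 3,  # assumed retry budget, not a value from the paper
) -> Optional[str]:
    """Return a specification that validates, or None if every attempt fails."""
    # Stage 1: a single attempt by the primary model.
    spec = primary(program)
    verdict = validate(program, spec)
    if verdict.ok:
        return spec

    # Stage 2: the collaborative model repairs the spec using validator feedback.
    for _ in range(max_repairs):
        spec = collaborator(program, spec, verdict.feedback)
        verdict = validate(program, spec)
        if verdict.ok:
            return spec
    return None


# Toy stand-ins so the sketch runs without any model or verifier installed.
def toy_primary(program: str) -> str:
    return "//@ requires x >= 0;"


def toy_collaborator(program: str, spec: str, feedback: str) -> str:
    return spec + "\n//@ ensures \\result >= 0;"


def toy_validate(program: str, spec: str) -> Verdict:
    if "ensures" in spec:
        return Verdict(ok=True)
    return Verdict(ok=False, feedback="missing postcondition")


result = generate_spec("int f(int x) { return x; }", toy_primary, toy_collaborator, toy_validate)
```

In this toy run the primary output fails validation because it has no postcondition, and the collaborator's first repair succeeds. Returning `None` after the retry budget is spent is a design choice of the sketch, not something the abstract specifies.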

Sun 12 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Session II - Human-AI Collaboration, Multi-agent Systems, & Benchmarking
Data and Benchmarking / Research Papers at Oceania I
Chair(s): Massimiliano Di Penta University of Sannio, Italy
14:00
12m
Talk
Reporting LLM Prompting in Automated Software Engineering: A Guideline Based on Current Practices and Expectations
Research Papers
Alexander Korn University of Duisburg-Essen, Lea Zaruchas University of Cologne, Chetan Arora Monash University, Andreas Metzger paluno – The Ruhr Institute for Software Technology, University of Duisburg-Essen, Sven Smolka University of Duisburg-Essen, Fanyu Wang Monash University, Andreas Vogelsang paluno – The Ruhr Institute for Software Technology, University of Duisburg-Essen
14:12
12m
Talk
Impacts of Generative AI on Agile Teams' Productivity: A Multi-Case Longitudinal Study
Research Papers
Rafael Tomaz Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Paloma Guenes Pontifical Catholic University of Rio de Janeiro (PUC-Rio) | University of Bari (UniBa), Allysson Allex Araújo Federal University of Cariri, Maria Teresa Baldassarre Department of Computer Science, University of Bari, Marcos Kalinowski Pontifical Catholic University of Rio de Janeiro (PUC-Rio)
Pre-print
14:24
6m
Talk
Visual Loop: Bridging the Cognitive Gap in Software Development Through Visual-AI Collaboration
Research Papers
Luis F. Gomes Carnegie Mellon University, Xin Zhou Singapore Management University, Singapore, David Lo Singapore Management University, Rui Abreu Faculty of Engineering of the University of Porto, Portugal
14:30
12m
Talk
AutoReSpec: A Framework for Generating Specification using Large Language Models
Virtual Attendance
Research Papers
Ragib Shahariar Ayon Texas State University, Shibbir Ahmed Texas State University
Pre-print Media Attached
14:42
6m
Talk
CodeViz: Collaborative Multi-Agent System for Analytical and Visualization Tasks in Data Science
Research Papers
Sai Sanjna Chintakunta Pennsylvania State University, Nathalia Nascimento Pennsylvania State University, Everton Guimaraes Pennsylvania State University
14:48
6m
Talk
SEMODS: A Validated Dataset of Open-Source Software Engineering Models
Data and Benchmarking
Alexandra González Universitat Politècnica de Barcelona - BarcelonaTech (UPC), Xavier Franch Universitat Politècnica de Catalunya, Silverio Martínez-Fernández UPC-BarcelonaTech
DOI Pre-print
14:54
6m
Talk
Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering
Data and Benchmarking
Daniel Rodriguez-Cardenas William & Mary, Xiaochang Li William & Mary, Marcos Macedo Queen's University, Antonio Mastropaolo William and Mary, USA, Dipin Khati William & Mary, Yuan Tian Queen's University, Kingston, Ontario, Huajie Shao College of William & Mary, Denys Poshyvanyk William & Mary
15:00
6m
Talk
OmniBench-RAG: A Multi-Domain Evaluation Platform for Retrieval-Augmented Generation Tools
Virtual Attendance
Data and Benchmarking
Jiaxuan Liang Huazhong University of Science and Technology, China, Shide Zhou Huazhong University of Science and Technology, Kailong Wang Huazhong University of Science and Technology