SANER 2026
Tue 17 - Fri 20 March 2026 Limassol, Cyprus

Automated program repair (APR) approaches aim to generate patches for bugs that are indicated by failing test cases. Traditionally, APR has been addressed with search-based approaches, which explore a large search space guided by a fitness function. More recently, large language models (LLMs) have been proposed independently for the same goal. In this paper, we propose and evaluate a combination of these two fundamentally different approaches.

Our experiment shows an increase of over 15 percentage points in the number of fixable bugs, a gain that is reachable only by the combined approach and not by the standalone search-based or LLM approaches. As a caveat, the effort required for the combined approach (runtime and cost of LLM calls) is higher than that of the standalone LLM approach.

Wed 18 Mar

Displayed time zone: Athens

11:00 - 12:30
11:00
15m
Talk
HieraTest: Hierarchical Dependency–Driven Framework with Multi-Strategy Repair for LLM-based Unit Test Generation
Research Track
Weichang Liu Zhejiang University, Junwei Zhang Zhejiang University, Xiaochun Zhu Insigma Hengtian Software LTD, Bo Zhou Northeastern University
11:15
15m
Talk
TestForge: A Benchmarking Framework for LLM-Based Test Case Generation
Research Track
Marco Vieira University of North Carolina at Charlotte, Bhavain Shah University of North Carolina at Charlotte, Priyam Ashish Shah University of North Carolina at Charlotte, Vineet Khadloya Salesforce
11:30
15m
Talk
RM -RF: Reward Model for Run-Free Unit Test Evaluation
Research Track
Elena Bruches Siberian Neuronets LLC, Daniil Grebenkin Siberian Neuronets LLC, Mikhail Klementev Siberian Neuronets LLC, Vadim Alperovich T-Technologies, Roman Derunets Siberian Neuronets LLC, Dari Baturova Siberian Neuronets LLC, Georgiy Mkrtchyan T-Technologies, Oleg Sedukhin Siberian Neuronets LLC, Ivan Bondarenko Novosibirsk State University, Nikolay Bushkov T-Technologies, Stanislav Moiseev T-Technologies
Pre-print
11:45
15m
Talk
Can We Classify Flaky Tests Using Only Test Code? An LLM-Based Empirical Study
Reproducibility Studies and Negative Results (RENE) Track
Alexander Berndt, Vekil Bekmyradov SAP, Rainer Gemulla University of Mannheim, Marcus Kessel University of Mannheim, Thomas Bach SAP, Sebastian Baltes Heidelberg University
12:00
7m
Talk
Integrating A Large Language Model Into Search-based Automated Program Repair
Short Papers and Posters Track
Adam Krafczyk University of Hildesheim, Klaus Schmid
12:07
7m
Talk
RisConFix: LLM-based Automated Repair of Risk-Prone Drone Configurations
Short Papers and Posters Track
Liping Han Nanjing University of Posts and Telecommunications, Tingting Nie Nanjing University of Posts and Telecommunications, Le Yu Nanjing University of Posts and Telecommunications, Mingzhe Hu Nanjing University of Posts and Telecommunications, Tao Yue Beihang University
12:14
7m
Talk
Leveraging Mutation Analysis for LLM-based Repair of Quantum Programs
Early Research Achievement (ERA) Track
Chihiro Yoshida The University of Osaka, Yuta Ishimoto Kyushu University, Olivier Nourry The University of Osaka, Masanari Kondo Kyushu University, Makoto Matsushita The University of Osaka, Yasutaka Kamei Kyushu University, Yoshiki Higo Osaka University
12:21
7m
Talk
AI-Assisted Semantic Modeling of Languages for Symbolic Execution Driven Unit Test Generation
Tool Demo Track
Mokshith Reddy Tanguturi, Atul Kumar IBM Research India, Nandakishore S Menon IBM Research India, Sridhar Chimalakonda Indian Institute of Technology Tirupati