ICSE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil

This program is tentative and subject to change.

Wed 15 Apr 2026 12:15 - 12:30 at Oceania IX - Testing and Analysis 1 Chair(s): Michael Pradel

Large Language Models (LLMs) are increasingly integrated into software systems as automated decision-making components. These systems rely on instruction prompts written in natural language to encode complex workflows. However, debugging these prompts when LLMs produce undesired outputs remains challenging, both because LLMs are black boxes and because manually inspecting large, complex inputs is impractical. Unlike traditional software, LLMs provide no access to execution paths or intermediate states, making it difficult to identify which input fragments are responsible for unexpected behavior.

This paper investigates whether delta debugging can be effectively applied to identify and isolate problematic parts of LLM inputs that lead to undesired outputs. We introduce semantic markers as an instrumentation technique that embeds unique identifiers in LLM inputs and extracts traceability information from chain-of-thought reasoning. We systematically evaluate whether these markers accurately identify causal input fragments and enable delta debugging to isolate minimal subsets responsible for incorrect outputs.
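The marker idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `[M<n>]` marker format, the function names `add_markers` and `cited_markers`, and the sample fragments and reasoning text are all assumptions made up for illustration. The sketch only shows the general shape of tagging input fragments with unique identifiers and then reading those identifiers back out of a model's reasoning.

```python
import re

# Assumed marker format: [M0], [M1], ... (hypothetical, not from the paper).
MARKER = re.compile(r"\[M(\d+)\]")


def add_markers(fragments):
    """Prefix each input fragment with a unique marker."""
    return [f"[M{i}] {text}" for i, text in enumerate(fragments)]


def cited_markers(reasoning):
    """Return marker indices mentioned in the reasoning, in order of first mention."""
    seen = []
    for match in MARKER.finditer(reasoning):
        index = int(match.group(1))
        if index not in seen:
            seen.append(index)
    return seen


# Made-up prompt fragments for illustration.
fragments = [
    "Approve refunds under $50 automatically.",
    "Escalate any request mentioning legal action.",
    "Always reply in a formal tone.",
]
prompt = "\n".join(add_markers(fragments))

# Made-up chain-of-thought text standing in for a real model response.
reasoning = "Per [M1] the request mentions legal action, so I escalate. [M2] sets the tone."
suspects = cited_markers(reasoning)
print(suspects)
```

In this sketch, the fragments whose markers appear in the reasoning become the candidates that a reduction step would examine first.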

Through experiments on a benchmark representing development scenarios and case studies from production systems, we demonstrate that delta debugging with semantic markers can systematically pinpoint problematic input fragments in both development and production settings. Our investigation shows that this approach transforms prompt debugging from an ad-hoc manual process into a systematic methodology, enabling engineers to efficiently identify and address the root causes of unexpected LLM behavior in real-world applications.
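For readers unfamiliar with delta debugging, the following is a minimal sketch of a ddmin-style reduction over a list of prompt fragments. It is a generic variant of the classic algorithm, not the procedure evaluated in the paper. The `fails` oracle here is a deterministic stand-in (an assumption): in a real setting it would send the reduced prompt to the model and check whether the undesired output still occurs. The fragment texts are made up.

```python
def ddmin(items, fails):
    """Shrink items to a sublist that still fails and from which no single
    element can be removed without the failure disappearing."""
    assert fails(items), "the full input must reproduce the failure"
    n = 2  # current number of chunks to split into
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(subset):
                # The failure is reproduced by one chunk alone: keep only that chunk.
                items, n, reduced = subset, 2, True
                break
            if fails(complement):
                # The failure survives removing this chunk: drop it.
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):
                break  # already at single-element granularity
            n = min(len(items), n * 2)  # retry with finer chunks
    return items


# Made-up fragments; the stand-in oracle "fails" only when two conflicting
# instructions are both present.
fragments = [
    "Greet the user by name.",
    "Use a formal tone.",
    "Never issue refunds above $100.",
    "Summarize the request first.",
    "Cite the relevant policy.",
    "Always grant refunds requested by premium users.",
    "Close with a thank-you note.",
    "Log the ticket number.",
]
conflict = {fragments[2], fragments[5]}


def fails(subset):
    return conflict.issubset(subset)


minimal = ddmin(fragments, fails)
print(minimal)
```

Because each oracle call would correspond to a model invocation in practice, caching results for already-tested subsets is a natural refinement that this sketch omits.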


Wed 15 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

11:00 - 12:30
Testing and Analysis 1, SE In Practice (SEIP) / Research Track, at Oceania IX
Chair(s): Michael Pradel CISPA Helmholtz Center for Information Security
11:00
15m
Talk
BFix: Automated Safe Memory-Leak Fixing for Binary Code (Virtual Attendance)
Research Track
Wen Zhang University of Georgia, Botang Xiao University of Georgia, Qingchen Kong University of Georgia, Boyang Yi University of Georgia, Suxin Ji University of Georgia, USA, Yage Hu University of Georgia, Songlan Wang University of Georgia, Wenwen Wang University of Georgia
11:15
15m
Talk
Learning without Forgetting: Towards Continual learning of Fault Localization Models in Industrial Software Systems (Virtual Attendance)
Research Track
Chun Li Nanjing University, Hui Li Samsung Electronics (China) R&D Centre, Zhong Li Nanjing University, Minxue Pan Nanjing University, Xuandong Li Nanjing University
11:30
15m
Talk
Memory-Efficient Large Language Models for Program Repair with Semantic-Guided Patch Generation (Virtual Attendance)
Research Track
Thanh Le-Cong Singapore University of Technology and Design, Singapore, Xuan-Bach D. Le University of Melbourne, Toby Murray University of Melbourne
11:45
15m
Talk
Addressing Test Flakiness: Practical Approaches in a Database-Reliant Industrial System
SE In Practice (SEIP)
George Vegelien Delft University of Technology, Carolin Brandt Delft University of Technology, Bas Graaf Exact, Arie van Deursen TU Delft
Pre-print
12:00
15m
Talk
XTrace: A Non-Invasive Dynamic Tracing Framework for Android Applications in Production
SE In Practice (SEIP)
Qi Hu ByteDance, Jiangchao Liu ByteDance, Lin Zhang ByteDance, Edward Jiang ByteDance, Xin Yu ByteDance
12:15
15m
Talk
Delta Debugging for LLM-integrated Systems
SE In Practice (SEIP)
Hao-Nan Zhu University of California, Davis, Muhammad Numair Mansur Amazon Web Services, Martin Schäf Amazon Web Services, Zeya Chen Amazon Web Services, Tancrède Lepoint Amazon, Willem Visser Amazon Web Services