IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models (FSE 2025 - Research Papers)

Mon 23 - Fri 27 June 2025 Trondheim, Norway

co-located with ISSTA 2025

Who

Sayem Mohammad Imtiaz, Astha Singh, Fraol Batole, Hridesh Rajan

Track

FSE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 23 Jun 2025 17:00 - 17:20 at Andromeda - Repairs Chair(s): Michael Pradel

Abstract

Not a day goes by without hearing about the impressive feats of large language models (LLMs), and equally, not a day passes without hearing about their challenges. LLMs are notoriously vulnerable to biases in their dataset, leading to issues such as toxicity, harmful responses, and factual inaccuracies. While domain-adaptive training has been employed to mitigate these issues, these techniques often address all model parameters indiscriminately during the repair process, resulting in poor repair quality and reduced model versatility. In this paper, drawing inspiration from fault localization via program slicing, we introduce a novel dynamic slicing-based intent-aware LLM repair strategy, IRepair. This approach selectively targets the most error-prone sections of the model for repair. Specifically, we propose dynamically slicing the model’s most sensitive layers that require immediate attention, concentrating repair efforts on those areas. This method enables more effective repairs with potentially less impact on the model’s overall versatility by altering a smaller portion of the model. Furthermore, dynamic selection allows for a more nuanced and precise model repair compared to a fixed selection strategy. We evaluated our technique on three models from the GPT2 and GPT-Neo families, with parameters ranging from 800M to 1.6B, in a toxicity mitigation setup. Our results show that IRepair repairs errors 43.6% more effectively while causing 46% less disruption to general performance compared to the closest baseline, direct preference optimization. Our empirical analysis also reveals that errors are more concentrated in a smaller section of the model, with the top 20% of layers exhibiting 773% more error density than the remaining 80%. This highlights the need for selective repair. Additionally, we demonstrate that a dynamic selection approach is essential for addressing errors dispersed throughout the model, ensuring a robust and efficient repair.
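
The selective, dynamic repair idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the sensitivity score (per-layer gradient L2 norm), the function names `select_slice` and `repair_step`, the plain gradient-descent update, and the toy layer names are all assumptions made for illustration, since the abstract does not say how IRepair scores or updates layers.

```python
from typing import Dict, List

Layer = List[float]  # toy stand-in for one layer's parameters


def select_slice(sensitivity: Dict[str, float], fraction: float = 0.2) -> List[str]:
    """Return the names of the top `fraction` of layers, most sensitive first."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(sensitivity) * fraction))  # always keep at least one layer
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    return ranked[:k]


def repair_step(
    weights: Dict[str, Layer],
    grads: Dict[str, Layer],
    lr: float = 0.1,
    fraction: float = 0.2,
) -> List[str]:
    """Update only the currently most sensitive layers; leave the rest frozen.

    Sensitivity is ASSUMED to be the L2 norm of each layer's gradient of the
    repair loss. The slice is recomputed on every call, so the set of updated
    layers can change from step to step (dynamic selection).
    """
    sensitivity = {name: sum(g * g for g in grad) ** 0.5 for name, grad in grads.items()}
    chosen = select_slice(sensitivity, fraction)
    for name in chosen:
        weights[name] = [w - lr * g for w, g in zip(weights[name], grads[name])]
    return chosen


if __name__ == "__main__":
    # Hypothetical five-layer model with made-up gradients.
    weights = {f"l{i}": [1.0, 1.0] for i in range(5)}
    step1 = {"l0": [0.1, 0.0], "l1": [0.0, 0.2], "l2": [3.0, 4.0], "l3": [0.1, 0.1], "l4": [0.0, 0.0]}
    step2 = {"l0": [0.1, 0.0], "l1": [0.0, 0.2], "l2": [0.3, 0.4], "l3": [0.1, 0.1], "l4": [2.0, 2.0]}
    print(repair_step(weights, step1))
    print(repair_step(weights, step2))
```

Because the slice is recomputed at every step, the layers chosen in the second call can differ from those chosen in the first. A fixed selection strategy would instead compute the slice once and reuse it for the whole repair run.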

DOI

https://doi.org/10.1145/3715775

Sayem Mohammad Imtiaz

Iowa State University

United States

Astha Singh

Dept. of Computer Science, Iowa State University

Fraol Batole

Tulane University

United States

Hridesh Rajan

Tulane University

United States

Session Program

Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:00 - 18:00	Repairs (Research Papers / Journal First) at Andromeda. Chair(s): Michael Pradel (University of Stuttgart)

16:00 20m Talk		HornBro: Homotopy-like Method for Automated Quantum Program Repair (Research Papers). Siwei Tan (Zhejiang University), Liqiang Lu (Zhejiang University), Debin Xiang (Zhejiang University), Tianyao Chu (Zhejiang University), Congliang Lang (Zhejiang University), Jintao Chen (Zhejiang University), Xing Hu (Zhejiang University), Jianwei Yin (Zhejiang University)
16:20 20m Talk		RePurr: Automated Repair of Block-Based Learners' Programs (Research Papers). Sebastian Schweikl (University of Passau), Gordon Fraser (University of Passau)
16:40 20m Talk		Demystifying Memorization in LLM-based Program Repair via a General Hypothesis Testing Framework (Research Papers). Jiaolong Kong (Singapore Management University), Xiaofei Xie (Singapore Management University), Shangqing Liu (Nanyang Technological University)
17:00 20m Talk		IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models (Research Papers). Sayem Mohammad Imtiaz (Iowa State University), Astha Singh (Dept. of Computer Science, Iowa State University), Fraol Batole (Tulane University), Hridesh Rajan (Tulane University)
17:20 20m Talk		Repairs and Breaks Prediction for Deep Neural Networks (Journal First). Yuta Ishimoto (Kyushu University), Masanari Kondo (Kyushu University), Lei Ma (The University of Tokyo & University of Alberta), Naoyasu Ubayashi (Waseda University), Yasutaka Kamei (Kyushu University)
17:40 20m Talk		Element-Based Automated DNN Repair with Fine-Tuned Masked Language Model (Research Papers). Xu Wang (Beihang University; Zhongguancun Laboratory; Ministry of Education), Mingming Zhang (Beihang University), Xiangxin Meng (Beihang University), Jian Zhang (Nanyang Technological University), Yang Liu (Nanyang Technological University), Chunming Hu (Beihang University)

Information for Participants

Mon 23 Jun 2025 16:00 - 18:00 at Andromeda - Repairs Chair(s): Michael Pradel

Info for room Andromeda:

Andromeda is located close to the restaurant and the bar, at the end of the corridor on the side of the bar.

From the registration desk, go towards the restaurant, turn left towards the bar, walk until the end of the corridor.