Together We Are Better: LLM, IDE and Semantic Embedding to Assist Move Method Refactoring (ICSME 2025 - Research Papers Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Abhiram Bellur, Fraol Batole, Malinda Dilhara, Mohammed Raihan Ullah, Yaroslav Zharov, Timofey Bryksin, Kai Ishikawa, Haifeng Chen, Masaharu Morimoto, Shota Motoura, Takeo Hosomi, Tien N. Nguyen, Hridesh Rajan, Nikolaos Tsantalis, Danny Dig

Track

ICSME 2025 Research Papers Track

Time Zone

Times are shown in the conference time zone, (GMT+12:00) Auckland, Wellington. The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 10 Sep 2025 13:55 - 14:10 at Case Room 3 260-055 - Session 3 - Debugging and Refactoring Chair(s): Ashkan Sami

Abstract

MoveMethod is a hallmark refactoring. Despite a plethora of research tools that recommend which methods to move and where, these recommendations do not align with how expert developers perform MoveMethod. Given the extensive training of Large Language Models (LLMs) and their reliance upon the naturalness of code, they should expertly recommend which methods are misplaced in a given class and which classes are better hosts. Our formative study of 2016 LLM recommendations revealed that LLMs give expert suggestions, yet they are unreliable: up to 80% of the suggestions are hallucinations. We introduce the first fully LLM-powered assistant for MoveMethod refactoring that automates its whole end-to-end lifecycle, from recommendation to execution. We designed novel solutions that automatically filter LLM hallucinations using static analysis from IDEs, and a novel workflow that requires LLMs to be self-consistent and to critique and rank refactoring suggestions. As MoveMethod refactoring requires global, project-level reasoning, we addressed the limited context size of LLMs by employing refactoring-aware retrieval-augmented generation (RAG). Our approach, MMpro, synergistically combines the strengths of the LLM, IDE, static analysis, and semantic relevance. In our thorough, multi-methodology empirical evaluation, we compare MMpro with the previous state-of-the-art approaches. MMpro significantly outperforms them: (i) on a benchmark widely used by other researchers, our Recall@1 and Recall@3 show a 1.7x improvement; (ii) on a corpus of 210 recent refactorings from open-source software, our Recall rates improve by at least 2.4x. Lastly, we conducted a user study with 30 experienced participants who used MMpro to refactor their own code for one week. They rated 82.8% of MMpro recommendations positively. This shows that MMpro is both effective and useful.
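
The Recall@1 and Recall@3 figures quoted in the abstract are ranking metrics. As a rough illustration only, the sketch below assumes the common definition: the fraction of ground-truth cases whose expected answer appears among the top k ranked suggestions. The abstract does not spell out the paper's exact definition, and the function name `recall_at_k`, the case identifiers, and the sample data are all hypothetical.

```python
from typing import Dict, Hashable, List


def recall_at_k(
    ranked: Dict[Hashable, List[Hashable]],
    truth: Dict[Hashable, Hashable],
    k: int,
) -> float:
    """Fraction of ground-truth cases whose expected item is in the top-k list.

    Assumed definition for illustration; not taken from the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not truth:
        return 0.0
    hits = sum(
        1
        for case, expected in truth.items()
        # A case with no suggestions at all counts as a miss.
        if expected in ranked.get(case, [])[:k]
    )
    return hits / len(truth)


# Hypothetical ground truth: the method that was actually moved in each case.
truth = {
    "case1": "Foo.bar",
    "case2": "Baz.qux",
    "case3": "A.b",
    "case4": "C.d",
}

# Hypothetical ranked suggestions from a recommender, best first.
ranked = {
    "case1": ["Foo.bar", "Foo.x"],            # correct at rank 1
    "case2": ["Baz.y", "Baz.z", "Baz.qux"],   # correct at rank 3
    "case3": ["A.c", "A.d", "A.e", "A.b"],    # correct only at rank 4
    # case4 has no suggestions
}

print(recall_at_k(ranked, truth, 1))
print(recall_at_k(ranked, truth, 3))
```

Under this assumed definition, a larger k can only keep the score the same or raise it, which is why Recall@3 is never lower than Recall@1 for the same recommender.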

Abhiram Bellur

University of Colorado Boulder

United States

Fraol Batole

Tulane University

United States

Malinda Dilhara

Amazon Web Services, USA

United States

Mohammed Raihan Ullah

University of Colorado Boulder

United States

Yaroslav Zharov

JetBrains Research

Germany

Timofey Bryksin

JetBrains Research

Cyprus

Kai Ishikawa

NEC Corporation

Japan

Haifeng Chen

NEC Laboratories America

United States

Masaharu Morimoto

NEC Corporation

Japan

Shota Motoura

NEC Corporation

Japan

Takeo Hosomi

NEC Corporation

Japan

Tien N. Nguyen

University of Texas at Dallas

United States

Hridesh Rajan

Tulane University

United States

Nikolaos Tsantalis

Concordia University

Canada

Danny Dig

University of Colorado Boulder, JetBrains Research

United States

Session Program

Wed 10 Sep
Displayed time zone: Auckland, Wellington

13:30 - 15:00	Session 3 - Debugging and Refactoring	Research Papers Track / Industry Track / Tool Demonstration Track / NIER Track	at Case Room 3 260-055	Chair(s): Ashkan Sami (Edinburgh Napier University)

13:30 15m	Boosting Redundancy-based Automated Program Repair by Fine-grained Pattern Mining	Research Papers Track	Jiajun Jiang (Tianjin University); Fengjie Li (Tianjin University); Zijie Zhao (Tianjin University); Zhirui Ye (Tianjin University); Mengjiao Liu (Tianjin University); Bo Wang (Beijing Jiaotong University); Hongyu Zhang (Chongqing University); Junjie Chen (Tianjin University)
13:45 10m	LadyBug: A GitHub Bot for UI-Enhanced Bug Localization in Mobile Apps	Tool Demonstration Track	Junayed Mahmud (University of Central Florida); James Chen (University of Toronto); Terry Achille (University of Central Florida); Camilo Alvarez-Velez (University of Central Florida); Darren Dean Bansil (University of Central Florida); Patrick Ijieh (University of Central Florida); Samar Karanch (University of Central Florida); Nadeeshan De Silva (William & Mary); Oscar Chaparro (William & Mary); Andrian Marcus (George Mason University); Kevin Moran (University of Central Florida)
13:55 15m	Together We Are Better: LLM, IDE and Semantic Embedding to Assist Move Method Refactoring	Research Papers Track	Abhiram Bellur (University of Colorado Boulder); Fraol Batole (Tulane University); Malinda Dilhara (Amazon Web Services, USA); Mohammed Raihan Ullah (University of Colorado Boulder); Yaroslav Zharov (JetBrains Research); Timofey Bryksin (JetBrains Research); Kai Ishikawa (NEC Corporation); Haifeng Chen (NEC Laboratories America); Masaharu Morimoto (NEC Corporation); Shota Motoura (NEC Corporation); Takeo Hosomi (NEC Corporation); Tien N. Nguyen (University of Texas at Dallas); Hridesh Rajan (Tulane University); Nikolaos Tsantalis (Concordia University); Danny Dig (University of Colorado Boulder, JetBrains Research)
14:10 10m	COB2PY - A Non-AI, Rule-Based COBOL to Python Translator	Tool Demonstration Track	Kowshik Reddy Challa (Indian Institute of Technology, Tirupati); Sonith M V (Indian Institute of Technology, Tirupati); Chiranjeevi B S (Indian Institute of Technology Tirupati); Sridhar Chimalakonda (Indian Institute of Technology Tirupati)
14:20 10m	How Does Test Code Differ From Production Code in Terms of Refactoring? An Empirical Study	NIER Track	Kosei Horikawa (Nara Institute of Science and Technology); Yutaro Kashiwa (Nara Institute of Science and Technology); Bin Lin (Hangzhou Dianzi University); Kenji Fujiwara (Nara Women’s University); Hajimu Iida (Nara Institute of Science and Technology)
14:30 10m	How Much Can a Behavior-Preserving Changeset Be Decomposed into Refactoring Operations?	NIER Track	Kota Someya (Institute of Science Tokyo); Lei Chen (Institute of Science Tokyo); Michael J. Decker (Bowling Green State University); Shinpei Hayashi (Institute of Science Tokyo)
14:40 15m	Governance Matters: Lessons from Restructuring the data.table OSS Project	Industry Track	Pedro Arantes (RESHAPE LAB, Northern Arizona University, USA); Doris Amoakohene (Northern Arizona University); Toby Hocking (Université de Sherbrooke); Marco Gerosa (Northern Arizona University); Igor Steinmacher (RESHAPE LAB, Northern Arizona University, USA)