InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation
Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance, and more. However, automating this process remains challenging due to the many syntactic and semantic differences between PLs. Recent studies show that even advanced techniques such as large language models (LLMs), especially open-source LLMs, still struggle with this task.
Code LLMs are currently trained on source code from multiple programming languages and thus exhibit multilingual capabilities. In this paper, we investigate whether such capabilities can be harnessed to enhance code translation. To this end, we introduce InterTrans, an LLM-based automated code translation approach that, in contrast to existing approaches, leverages intermediate translations to bridge the syntactic and semantic gaps between source and target PLs. InterTrans consists of two stages: it first uses a novel Tree of Code Translation (ToCT) algorithm to plan transitive intermediate translation sequences between a given source and target PL, and then validates them in a specific order. We evaluate InterTrans with three open LLMs on three benchmarks (i.e., CodeNet, HumanEval-X, and TransCoder) involving six PLs. Results show an absolute improvement of 18.3% to 43.3% in Computation Accuracy (CA) for InterTrans over Direct Translation with 10 attempts. The best-performing variant of InterTrans (with the Magicoder LLM) achieved an average CA of 87.3%-95.4% on the three benchmarks.
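The planning-then-validation idea can be pictured with a minimal sketch in Python, assuming a breadth-ordered enumeration of translation paths (the direct translation first, then paths routed through one or two intermediate PLs) and a try-until-tests-pass validation loop. The helpers llm_translate and passes_tests are hypothetical stand-ins, and the ordering heuristic is an assumption for illustration, not the paper's actual ToCT implementation.

from itertools import permutations

def plan_translation_paths(source_pl, target_pl, intermediate_pls, max_depth=2):
    """Enumerate candidate translation sequences from source_pl to target_pl.

    A depth-0 path is the direct translation; a depth-k path routes the code
    through k distinct intermediate PLs. Shorter paths are listed first
    (an assumed ordering, used here only for illustration).
    """
    candidates = [p for p in intermediate_pls if p not in (source_pl, target_pl)]
    paths = []
    for depth in range(max_depth + 1):
        for hops in permutations(candidates, depth):
            paths.append((source_pl, *hops, target_pl))
    return paths

def translate_via_paths(code, paths, llm_translate, passes_tests):
    """Attempt each planned path in order; return the first translation in the
    target PL that passes the test suite, or (None, None) if all paths fail."""
    for path in paths:
        current = code
        for src, dst in zip(path, path[1:]):
            # One LLM translation step from src to dst (hypothetical helper).
            current = llm_translate(current, src, dst)
        if passes_tests(current, path[-1]):  # hypothetical test-based validation
            return current, path
    return None, None

if __name__ == "__main__":
    # Example: plan paths from Java to Python via up to two intermediate PLs.
    for path in plan_translation_paths("Java", "Python", ["C++", "Go", "Rust"]):
        print(" -> ".join(path))

In this sketch, the enumeration mirrors the idea of a tree of candidate translation sequences, and the validation loop stops at the first candidate whose final translation passes the tests; the actual ordering and validation strategy used by InterTrans is described in the paper.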
Wed 30 Apr | Displayed time zone: Eastern Time (US & Canada)
16:00 - 17:30 | AI for SE 2 | Research Track / Journal-first Papers | Canada Hall 1 and 2 | Chair(s): Tingting Yu (University of Connecticut)
16:00 15m Talk | Large Language Models for Safe Minimization | Research Track | Aashish Yadavally (University of Texas at Dallas), Xiaokai Rong (The University of Texas at Dallas), Phat Nguyen (The University of Texas at Dallas), Tien N. Nguyen (University of Texas at Dallas)
16:15 15m Talk | LUNA: A Model-Based Universal Analysis Framework for Large Language Models | Journal-first Papers | Da Song (University of Alberta), Xuan Xie (University of Alberta), Jiayang Song (University of Alberta), Derui Zhu (Technical University of Munich), Yuheng Huang (University of Alberta, Canada), Felix Juefei-Xu (New York University), Lei Ma (The University of Tokyo & University of Alberta)
16:30 15m Talk | Intention is All You Need: Refining Your Code from Your Intention | Research Track | Qi Guo (Tianjin University), Xiaofei Xie (Singapore Management University), Shangqing Liu (Nanyang Technological University), Ming Hu (Nanyang Technological University), Xiaohong Li (Tianjin University), Lei Bu (Nanjing University)
16:45 15m Talk | RLCoder: Reinforcement Learning for Repository-Level Code Completion | Research Track | Yanlin Wang (Sun Yat-sen University), Yanli Wang (Sun Yat-sen University), Daya Guo, Jiachi Chen (Sun Yat-sen University), Ruikai Zhang (Huawei Cloud Computing Technologies), Yuchi Ma (Huawei Cloud Computing Technologies), Zibin Zheng (Sun Yat-sen University)
17:00 15m Talk | InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Research Track | Marcos Macedo (Queen's University), Yuan Tian (Queen's University, Kingston, Ontario), Pengyu Nie (University of Waterloo), Filipe Cogo (Centre for Software Excellence, Huawei Canada), Bram Adams (Queen's University)
17:15 15m Talk | Toward a Theory of Causation for Interpreting Neural Code Models | Journal-first Papers | David Nader Palacio (William & Mary), Alejandro Velasco (William & Mary), Nathan Cooper (William & Mary), Alvaro Rodriguez (Universidad Nacional de Colombia), Kevin Moran (University of Central Florida), Denys Poshyvanyk (William & Mary)