GlueTest: Testing Code Translation via Language Interoperability (NIER Paper)
Code translation from one programming language to another has long been of interest to academia and industry, and has recently re-emerged with the advent of Large Language Models (LLMs). While progress has been made in translating small code snippets, tackling larger projects with intricate dependencies remains challenging. A significant challenge in automating such translations is validating the resulting code: translating the existing tests to the target language can itself introduce errors, so even a fully passing translated test suite can give misleading quality assurance.
We propose the idea of testing the translated code using the existing, untranslated tests written in the original programming language. The key to our idea is to leverage language interoperability to run code written in two different languages together. This partial translation approach offers two main benefits: (1) the ability to leverage original tests for validating translated code, not only from the project being translated but also from the clients using this project, and (2) the continuous maintainability and testability of the project during translation.
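To make the idea concrete, below is a minimal sketch of what such interoperability "glue" could look like, assuming GraalVM's polyglot API (the framework used in the evaluation below) as the interoperability layer. A Java class keeps the library's original API surface, so the untranslated Java tests still compile against it, while every call is forwarded to the translated Python implementation. The Python module name (commons_cli.option) and the method names are illustrative assumptions, not taken from the paper's artifact.

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

// Hypothetical glue class: same Java API as before, but the behavior now
// comes from the translated Python code running in a GraalVM polyglot context.
public class Option {
    // One shared context that hosts the translated Python code.
    private static final Context PYTHON =
            Context.newBuilder("python").allowAllAccess(true).build();

    static {
        // Load the (hypothetical) translated module once.
        PYTHON.eval("python", "from commons_cli.option import Option");
    }

    private final Value delegate; // the underlying Python object

    public Option(String opt, String description) {
        Value pyClass = PYTHON.getBindings("python").getMember("Option");
        this.delegate = pyClass.newInstance(opt, description);
    }

    public String getOpt() {
        // Original callers and tests see a plain Java method; the glue
        // forwards the call to Python and converts the result back to Java.
        return delegate.invokeMember("get_opt").asString();
    }
}
```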
We evaluate our approach by translating two popular Java libraries, Apache Commons CLI and Apache Commons CSV, from Java to Python; the two libraries comprise 1209 lines of code (in 22 Java files) and 860 lines of code (in 10 Java files), respectively. Our implementation uses Oracle’s GraalVM framework for language interoperability. We successfully validate the translation using the original Java tests, not just from the CLI and CSV libraries themselves but also from client projects of these libraries (30 clients for CLI and 6 for CSV). Our approach is the first to systematically and semi-automatically validate translations of such non-trivial libraries.
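For illustration, an original JUnit test can then exercise the Python translation without modification by running against the glue class sketched above. The test below is a hypothetical example in the style of the Commons CLI test suite, not a test taken from it.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical JUnit 4 test written against the original Java API. It is
// unaware that Option now delegates to Python via GraalVM, so a passing run
// validates the translated code with an untranslated test.
public class OptionGlueTest {

    @Test
    public void shortOptionNameIsPreserved() {
        Option option = new Option("f", "the file to read");
        assertEquals("f", option.getOpt());
    }
}
```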
Fri 11 Oct (displayed time zone: Arizona)
10:30 - 12:00 | Session 12: Machine Learning in Software Engineering (Tool Demo Track / Research Track / New Ideas and Emerging Results Track / Registered Reports Track) at Abineau. Chair(s): Mohammed Sayagh (ETS Montreal, University of Quebec)
10:30 (15m) | Can We Do Better with What We Have Done? Unveiling the Potential of ML Pipeline in Notebooks (Research Track Paper)
10:45 (10m) | MergeRepair: Merging Task-Specific Adapters in Code LLMs for Automated Program Repair (Registered Reports Paper). Meghdad Dehghan (University of British Columbia), Jie JW Wu (University of British Columbia (UBC)), Fatemeh Hendijani Fard (University of British Columbia), Ali Ouni (ETS Montreal, University of Quebec). Pre-print
10:55 (15m) | On the Use of Deep Learning Models for Semantic Clone Detection (Research Track Paper). Subroto Nag Pinku (University of Saskatchewan), Debajyoti Mondal, Chanchal K. Roy (University of Saskatchewan, Canada)
11:10 (10m) | GlueTest: Testing Code Translation via Language Interoperability (NIER Paper, New Ideas and Emerging Results Track). Muhammad Salman Abid (Cornell University), Mrigank Pawagi (Indian Institute of Science, Bengaluru), Sugam Adhikari (Islington College), Xuyan Cheng (Dickinson College), Ryed Badr (University of Illinois Urbana Champaign), Md Wahiduzzaman (BRAC University), Vedant Rathi (Adlai E Stevenson High School), Ronghui Qi (Wuhan University), Choiyin Li (Po Leung Kuk Ngan Po Ling College), Lu Liu (University of Washington), Rohit Sai Naidu (Dublin High School), Licheng Lin (Zhejiang University), Que Liu (University of Shanghai for Science and Technology), Asif Zubayer Palak (BRAC University), Mehzabin Haque (University of Dhaka), Xinyu Chen (University of Illinois Urbana Champaign), Darko Marinov (University of Illinois at Urbana-Champaign), Saikat Dutta (Cornell University)
11:20 (10m) | Does Co-Development with AI Assistants Lead to More Maintainable Code? A Registered Report (Registered Reports Paper). Markus Borg (CodeScene), Dave Hewett (Equal Experts), Donald Graham (Equal Experts), Noric Couderc (Lund University), Emma Söderberg (Lund University), Luke Church (University of Cambridge / Candela Inc), Dave Farley (Continuous Delivery). Pre-print
11:30 (15m) | Leveraging Large Vision-Language Model For Better Automatic Web GUI Testing (Research Track Paper). Siyi Wang, Sinan Wang (Southern University of Science and Technology), Yujia Fan, Xiaolei Li, Yepang Liu (Southern University of Science and Technology)
11:45 (5m) | StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation (Tool Demo Paper). Davit Abrahamyan (University of British Columbia), Fatemeh Hendijani Fard (University of British Columbia)