How is Google using AI for internal code migrations?
This program is tentative and subject to change.
In recent years, there has been tremendous interest in using generative AI, and particularly large language models (LLMs), in software engineering; indeed, several companies now offer commercially available tools, and many large companies have also created their own ML-based tools for their software engineers. While the use of ML for common tasks such as code completion is available in commodity tools, there is growing interest in applying LLMs to more bespoke purposes. One such purpose is code migration.
This article is an experience report on using LLMs for code migrations at Google. It is not a research study, in the sense that we do not carry out comparisons against other approaches or evaluate research questions or hypotheses. Rather, we share our experiences in applying LLM-based code migration in an enterprise context across a range of migration cases, in the hope that other industry practitioners will find our insights useful. Many of these learnings apply to any bespoke application of ML in software engineering. We see evidence that the use of LLMs can significantly reduce the time needed for migrations and can lower the barriers to starting and completing migration programs.
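The abstract above does not spell out how such a migration pipeline works, so the following is a minimal, purely illustrative sketch of the general pattern of pairing model-generated edits with automated validation; it is not a description of Google's internal tooling. The module names old_api and new_api, the rewrite callable (standing in for whatever LLM backend is available), and test_cmd are all hypothetical placeholders.

    # Illustrative sketch only: LLM-proposed rewrites gated by the project's test suite.
    import pathlib
    import subprocess
    from typing import Callable

    PROMPT_TEMPLATE = (
        "Migrate this file from the deprecated `old_api` module to `new_api`, "
        "preserving behavior. Return only the complete updated file.\n\n{code}"
    )

    def migrate_file(path: pathlib.Path,
                     rewrite: Callable[[str], str],
                     test_cmd: list[str]) -> bool:
        """Ask the model for a rewrite; keep it only if the tests still pass."""
        original = path.read_text()
        candidate = rewrite(PROMPT_TEMPLATE.format(code=original))
        path.write_text(candidate)
        if subprocess.run(test_cmd).returncode == 0:
            return True               # migration accepted
        path.write_text(original)     # revert on test failure
        return False

    def migrate_repo(root: pathlib.Path,
                     rewrite: Callable[[str], str],
                     test_cmd: list[str]) -> None:
        # Naive call-site discovery: any Python file importing the deprecated module.
        for path in root.rglob("*.py"):
            if "import old_api" in path.read_text():
                status = "migrated" if migrate_file(path, rewrite, test_cmd) else "reverted"
                print(f"{path}: {status}")

In practice, a loop like this would typically sit behind batching, human code review, and additional build and lint checks before any change is committed.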
Fri 2 May | Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30 | AI for SE 3 | New Ideas and Emerging Results (NIER) / Journal-first Papers / Research Track / SE In Practice (SEIP) | Canada Hall 1 and 2 | Chair(s): Ying Zou (Queen's University, Kingston, Ontario)

11:00 | 15m Talk | A First Look at Conventional Commits Classification | Research Track | Qunhong Zeng (Beijing Institute of Technology), Yuxia Zhang (Beijing Institute of Technology), Zhiqing Qiu (Beijing Institute of Technology), Hui Liu (Beijing Institute of Technology)

11:15 | 15m Talk | ChatGPT-Based Test Generation for Refactoring Engines Enhanced by Feature Analysis on Examples | Research Track | Chunhao Dong (Beijing Institute of Technology), Yanjie Jiang (Peking University), Yuxia Zhang (Beijing Institute of Technology), Yang Zhang (Hebei University of Science and Technology), Hui Liu (Beijing Institute of Technology)

11:30 | 15m Talk | SECRET: Towards Scalable and Efficient Code Retrieval via Segmented Deep Hashing | Research Track | Wenchao Gu (The Chinese University of Hong Kong), Ensheng Shi (Xi’an Jiaotong University), Yanlin Wang (Sun Yat-sen University), Lun Du (Microsoft Research), Shi Han (Microsoft Research), Hongyu Zhang (Chongqing University), Dongmei Zhang (Microsoft Research), Michael Lyu (The Chinese University of Hong Kong)

11:45 | 15m Talk | UniGenCoder: Merging Seq2Seq and Seq2Tree Paradigms for Unified Code Generation | New Ideas and Emerging Results (NIER) | Liangying Shao (School of Informatics, Xiamen University, China), Yanfu Yan (William & Mary), Denys Poshyvanyk (William & Mary), Jinsong Su (School of Informatics, Xiamen University, China)

12:00 | 15m Talk | How is Google using AI for internal code migrations? | SE In Practice (SEIP) | Stoyan Nikolov (Google, Inc.), Daniele Codecasa (Google, Inc.), Anna Sjovall (Google, Inc.), Maxim Tabachnyk (Google), Siddharth Taneja (Google, Inc.), Celal Ziftci (Google), Satish Chandra (Google, Inc.)

12:15 | 7m Talk | LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation | Journal-first Papers | Sarah Fakhoury (Microsoft Research), Aaditya Naik (University of Pennsylvania), Georgios Sakkas (University of California at San Diego), Saikat Chakraborty (Microsoft Research), Shuvendu K. Lahiri (Microsoft Research)

12:22 | 7m Talk | The impact of Concept drift and Data leakage on Log Level Prediction Models | Journal-first Papers | Youssef Esseddiq Ouatiti (Queen's University), Mohammed Sayagh (ETS Montreal, University of Quebec), Noureddine Kerzazi (Ensias-Rabat), Bram Adams (Queen's University), Ahmed E. Hassan (Queen's University)