Improving Code Autocompletion with Transfer Learning
Thu 12 May 2022 03:00 - 03:05 at ICSE room 2-odd hours - Software Engineering in Practice 1 Chair(s): Mary Sánchez-Gordón
Software language models have achieved promising results in predicting code completion usage, and several industry studies have described successful IDE integrations. Recently, accuracy in autocompletion prediction improved by 12.8% from training on a real-world dataset collected from programmers’ IDE activity. But what if only limited examples of IDE autocompletion in the target programming language are available for model training? In this paper, we investigate the efficacy of pretraining autocompletion models on non-IDE, non-autocompletion, and different-language example code sequences. We find that these unsupervised pretrainings improve model accuracy by over 50% on very small fine-tuning datasets and by over 10% on 50k labeled examples. We confirm the real-world impact of these pretrainings in an online setting through A/B testing on thousands of IDE autocompletion users, finding that pretraining is responsible for increases of up to 6.63% in autocompletion usage.
Tue 10 May (displayed time zone: Eastern Time (US & Canada))
22:00 - 23:00 | Software Engineering in Practice 3 | SEIP - Software Engineering in Practice | at ICSE room 2-even hours | Chair(s): Nancy Mead (Carnegie Mellon Software Engineering Institute)
22:00 (5m, Talk) | Automatically Identifying Shared Root Causes of Test Breakages in SAP HANA | SEIP - Software Engineering in Practice | Gabin An (KAIST), Juyeon Yoon (Korea Advanced Institute of Science and Technology), Jeongju Sohn (University of Luxembourg), Jingun Hong (SAP Labs), Dongwon Hwang (SAP Labs), Shin Yoo (KAIST) | Pre-print, Media Attached
22:05 (5m, Talk) | Record and Replay of Online Traffic for Microservices with Automatic Mocking Point Identification | SEIP - Software Engineering in Practice | Jiangchao Liu (Ant Group), Jierui Liu (Ant Group), Peng Di (Ant Group), Alex X. Liu (Ant Group), Zexin Zhong (Ant Group; University of Technology Sydney) | Pre-print, Media Attached
22:10 (5m, Talk) | Field-based Static Taint Analysis for Industrial Microservices | SEIP - Software Engineering in Practice | Zexin Zhong (Ant Group; University of Technology Sydney), Jiangchao Liu (Ant Group), Diyu Wu (Ant Group), Peng Di (Ant Group), Yulei Sui (University of Technology Sydney), Alex X. Liu (Ant Group) | Pre-print, Media Attached
22:15 (5m, Talk) | A Cross-Company Ethnographic Study on Software Teams for DevOps and Microservices: Organization, Benefits, and Issues | SEIP - Software Engineering in Practice | Xin Zhou (Nanjing University, China), Huang Huang (State Grid Nanjing Power Supply Company), He Zhang (Nanjing University), Xin Huang, Dong Shao (Nanjing University), Chenxing Zhong (Nanjing University) | Pre-print
22:20 (5m, Talk) | An Industrial Experience Report on Retro-inspection | SEIP - Software Engineering in Practice | Lanxin Yang (Nanjing University), He Zhang (Nanjing University), Fuli Zhang (Nanjing University), Xiaodong Zhang (Nanjing University), Guoping Rong (Nanjing University) | DOI, Pre-print, Media Attached
22:25 (5m, Talk) | Improving Code Autocompletion with Transfer Learning | SEIP - Software Engineering in Practice | Gareth Aye (Facebook, Inc.), Wen Zhou (Facebook), Vijayaraghavan Murali (Meta Platforms, Inc.), Seohyun Kim (Meta) | Pre-print
Thu 12 May (displayed time zone: Eastern Time (US & Canada))
03:00 - 04:00 | Software Engineering in Practice 1 | SEIP - Software Engineering in Practice | at ICSE room 2-odd hours | Chair(s): Mary Sánchez-Gordón (Østfold University College)
03:00 (5m, Talk) | Improving Code Autocompletion with Transfer Learning | SEIP - Software Engineering in Practice | Gareth Aye (Facebook, Inc.), Wen Zhou (Facebook), Vijayaraghavan Murali (Meta Platforms, Inc.), Seohyun Kim (Meta) | Pre-print
03:05 (5m, Talk) | On the Effectiveness of Machine Learning Experiment Management Tools | SEIP - Software Engineering in Practice | Samuel Idowu (Chalmers | University of Gothenburg), Osman Hasan (National University of Sciences & Technology), Daniel Strüber (Chalmers | University of Gothenburg / Radboud University), Thorsten Berger | Pre-print, Media Attached
03:10 (5m, Talk) | Looking for Lacunae in Bitcoin Core’s Fuzzing Efforts | SEIP - Software Engineering in Practice | Alex Groce (Northern Arizona University), Kush Jain (Carnegie Mellon University), Rijnard van Tonder (Sourcegraph), Goutamkumar Tulajappa Kalburgi (Northern Arizona University), Claire Le Goues (Carnegie Mellon University)
03:15 (5m, Talk) | AI for Automated Code Updates | SEIP - Software Engineering in Practice | Salwa Alamir (J.P. Morgan AI Research), Petr Babkin (J.P. Morgan AI Research), Nacho Navarro (J.P. Morgan AI Research), Sameena Shah (J.P. Morgan AI Research) | Pre-print, Media Attached