Code Prediction by Feeding Trees to Transformers (Technical Track)
This program is tentative and subject to change.
Code prediction, and more specifically autocomplete, has become an essential feature of modern IDEs. Autocomplete is most effective when the desired next token is at (or close to) the top of the list of potential completions that the IDE offers at the cursor position. This is where the strength of the underlying machine learning system, which produces the ranked list of potential completions, comes into play.
We advance the state of the art in the accuracy of code prediction (next token prediction) used in autocomplete systems. Our work uses Transformers as the base neural architecture. We show that making the Transformer architecture aware of the syntactic structure of code increases the margin by which a Transformer-based system outperforms previous systems. The resulting system exceeds the accuracy of several state-of-the-art next token prediction systems by margins ranging from 14% to 18%.
In the paper, we present several ways of communicating code structure to the Transformer, which is fundamentally built for processing sequence data. We provide a comprehensive experimental evaluation of our proposal, along with alternative design choices, on a standard Python dataset as well as on a company-internal Python corpus. Our code and data preparation pipeline will be available in open source.
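The abstract does not spell out how code structure is communicated to a sequence model, so the following is only a minimal sketch of one conceivable approach, not necessarily the one used in the paper: flattening a Python syntax tree into a token sequence by pre-order traversal. The function name `preorder_tokens` and the choices to skip expression-context nodes (`Load`, `Store`) and `None` values are assumptions made for illustration; the only library used is Python's standard `ast` module.

```python
import ast


def preorder_tokens(source):
    """Flatten the AST of `source` into a list of strings in pre-order.

    Hypothetical helper for illustration: each node contributes its type
    name, followed by its children and leaf values in field order.
    """
    tokens = []

    def emit(value):
        # Skip Load/Store/Del markers; they add little for this sketch.
        if isinstance(value, ast.expr_context):
            return
        if isinstance(value, ast.AST):
            # Visit the node itself first, then its fields (pre-order).
            tokens.append(type(value).__name__)
            for _, child in ast.iter_fields(value):
                emit(child)
        elif isinstance(value, list):
            for item in value:
                emit(item)
        elif value is not None:
            # Leaf values such as identifiers and constants.
            tokens.append(str(value))

    emit(ast.parse(source))
    return tokens


if __name__ == "__main__":
    print(preorder_tokens("x = y.z"))
```

A sequence produced this way interleaves node types with identifiers, so a sequence model sees some syntactic context (for example, that `z` is an attribute of `y`) that a plain stream of source tokens does not state explicitly.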
Wed 26 May. Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
18:50 - 19:50 | 2.5.3. Code Completion (Technical Track / SEIP - Software Engineering in Practice) at Blended Sessions Room 3. Chair(s): Marsha Chechik (University of Toronto)
18:50, 20m, Paper | Siri, Write the Next Method (Technical Track). Fengcai Wen (Software Institute, USI Università della Svizzera italiana), Emad Aghajani (Software Institute, USI Università della Svizzera italiana), Csaba Nagy (Software Institute, USI Università della Svizzera italiana), Michele Lanza (Software Institute, USI Università della Svizzera italiana), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana). Pre-print
19:10, 20m, Paper | Code Prediction by Feeding Trees to Transformers (Technical Track). Seohyun Kim (Facebook), Jinman Zhao (University of Wisconsin-Madison, USA), Yuchi Tian (Columbia University), Satish Chandra (Facebook, USA). Pre-print
19:30, 20m, Paper | Learning Autocompletion from Real-World Datasets (SEIP - Software Engineering in Practice). Pre-print