The Naturalness of Software, and the roots and branches thereof
Most Influential Paper from ICSE 2012 keynote
Our work, 10 years ago, began with spirited debates over Gabel & Su’s inspired discovery of the non-uniqueness of software. Those discussions led to the finding that statistical language models (from NLP) were remarkably effective at modeling and predicting code. That work, together with the explosive proliferation of deep learning, has led to a spate of applications, culminating in the recent advent of the GPT-3/Codex language model in Visual Studio, which most of us know as GitHub Copilot. Copilot has fulfilled much of the disruptive potential that we wrote about 10 years ago in our ICSE submission. Since then, our work has explored the question of why code is so natural; this led us to posit the dual-channel model of code, in which one channel is noisy and the other formal. This model promises new ways of training models for SE tasks, as well as new ways of thinking about program understanding, code reading, and teaching programming.
Fri 13 May, Eastern Time (US & Canada)
09:00 - 09:30