Keynote: Interpreting Neural Models of Code by Comparing Them with Human Developers
Neural models of code are successfully tackling various prediction tasks, complementing and sometimes even outperforming traditional program analyses. While most work focuses on end-to-end evaluations of such models, it often remains unclear what the models actually learn, and to what extent their reasoning about code matches that of skilled humans. A poor understanding of a model's reasoning risks deploying models that are right for the wrong reason, and making decisions based on spurious correlations in the training dataset. This talk presents our attempts at understanding and interpreting neural models of code by directly comparing them with human developers. To this end, we present a methodology for recording human attention and use it to gather thousands of human attention maps for two tasks: code summarization and automated program repair. Comparing the human attention to four attention-based neural models shows that the attention of these models resembles the way humans reason about code. At the same time, there are important differences: for example, some models ignore certain kinds of tokens, such as strings, that humans deem important, and other models miss most of the code around a line of interest. The results also show that human-model agreement positively correlates with accurate predictions by a model, which calls for neural models that even more closely mimic human reasoning. Beyond the insights from our study, we envision the release of our dataset of human attention maps to help understand future neural models of code and to foster work on human-inspired models.
Michael Pradel is a full professor at the University of Stuttgart, which he joined after a PhD at ETH Zurich, a post-doc at UC Berkeley, an assistant professorship at TU Darmstadt, and a sabbatical at Facebook. His research interests span software engineering, programming languages, security, and machine learning, with a focus on tools and techniques for building reliable, efficient, and secure software. In particular, he is interested in neural software analysis, analyzing web applications, dynamic analysis, and test generation. Michael has been recognized through the Ernst-Denert Software Engineering Award, an Emmy Noether grant by the German Research Foundation (DFG), an ERC Starting Grant, four best/distinguished paper awards, and by being named an ACM Distinguished Member.
Sun 14 May (displayed time zone: Hobart)
13:45 - 15:15 | Session 3 (InteNSE) at Meeting Room 110 | Chair(s): Reyhaneh Jabbarvand, University of Illinois at Urbana-Champaign
13:45 | 60m | Keynote | Keynote: Interpreting Neural Models of Code by Comparing Them with Human Developers | InteNSE | Michael Pradel, University of Stuttgart
14:45 | 30m | Research paper | Probing Numeracy and Logic of Language Models of Code | InteNSE | Razan Baltaji, University of Illinois at Urbana-Champaign; Parth Thakkar, University of Illinois at Urbana-Champaign