Human Inspection of Code at Scale: The Value of Variation in Informing Decision-Making
There’s a lot we can learn from the code that already exists, and even from the generated code that large language models (LLMs) remix and suggest back to us. Dense code corpora, that is, many examples that use the same library or perform the same or similar functions, can be created trivially via code search or an LLM. As predicted by two theories of human concept learning, namely Variation Theory and Analogical Learning Theory, exposing the analogical relationships across these examples, and the variation within those relationships, can help humans inspect, learn from, and make decisions based on these code corpora. In this talk I’ll review several systems in which this has been demonstrated.
I design, build and evaluate systems for comprehending and interacting with population-level structure and trends in large code and data corpora. I am currently an Assistant Professor of Computer Science at the Harvard Paulson School of Engineering & Applied Sciences, specializing in Human-Computer Interaction. From 2018 to 2022, I was the Stanley A. Marks & William H. Marks Professor at the Radcliffe Institute for Advanced Study, and more recently I was named a 2023 Sloan Research Fellow. I earned a PhD and MEng in Electrical Engineering and Computer Science and a BS in Electrical Science and Engineering, all at MIT. Before joining Harvard, I was a postdoctoral scholar in Electrical Engineering and Computer Science at the University of California, Berkeley, where I received the Berkeley Institute for Data Science Moore/Sloan Data Science Fellowship.