Deep Learning-based Code Reviews: A Paradigm Shift or a Double-Edged Sword?
This program is tentative and subject to change.
Several techniques have been proposed to (partially) automate code review. Early support consisted in recommending the most suited reviewer for a given change or in prioritizing the review tasks. With the advent of deep learning in software engineering, the level of automation has been pushed to new heights, with approaches able to provide feedback on source code in natural language as a human reviewer would do. Also, recent work documented open source projects adopting Large Language Models (LLMs) as co-reviewers. Although the research in this field is very active, little is known about the actual impact of including automatically generated code reviews in the code review process. While there are many aspects worth investigating (e.g., is knowledge transfer between developers affected?), in this work we focus on three of them: (i) review quality, i.e., the reviewer’s ability to identify issues in the code; (ii) review cost, i.e., the time spent reviewing the code; and (iii) reviewer’s confidence, i.e., how confident is the reviewer about the provided feedback. We run a controlled experiment with 29 professional developers who reviewed different programs with/without the support of an automatically generated code review. During the experiment we monitored the reviewers’ activities, for over 50 hours of recorded code reviews. We show that reviewers consider valid most of the issues automatically identified by the LLM and that the availability of an automated review as a starting point strongly influences their behavior: Reviewers tend to focus on the code locations indicated by the LLM rather than searching for additional issues in other parts of the code. The reviewers who started from an automated review identified a higher number of low-severity issues while, however, not identifying more high- severity issues as compared to a completely manual process. Finally, the automated support did not result in saved time and did not increase the reviewers’ confidence.
This program is tentative and subject to change.
Thu 1 MayDisplayed time zone: Eastern Time (US & Canada) change
14:00 - 15:30 | |||
14:00 15mTalk | Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers Research Track Yuling Shi Shanghai Jiao Tong University, Hongyu Zhang Chongqing University, Chengcheng Wan East China Normal University, Xiaodong Gu Shanghai Jiao Tong University | ||
14:15 15mTalk | Deep Learning-based Code Reviews: A Paradigm Shift or a Double-Edged Sword? Research Track Rosalia Tufano Università della Svizzera Italiana, Alberto Martin-Lopez Software Institute - USI, Lugano, Ahmad Tayeb , Ozren Dabic Software Institute, Università della Svizzera italiana (USI), Switzerland, Sonia Haiduc , Gabriele Bavota Software Institute @ Università della Svizzera Italiana | ||
14:30 15mTalk | An Exploratory Study of ML Sketches and Visual Code Assistants Research Track Luis F. Gomes Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University, Jonathan Aldrich Carnegie Mellon University, Rui Abreu INESC-ID; University of Porto | ||
14:45 15mTalk | LiCoEval: Evaluating LLMs on License Compliance in Code Generation Research Track Weiwei Xu Peking University, Kai Gao Peking University, Hao He Carnegie Mellon University, Minghui Zhou Peking University Pre-print | ||
15:00 15mTalk | Trust Dynamics in AI-Assisted Development: Definitions, Factors, and Implications Research Track Sadra Sabouri University of Southern California, Philipp Eibl University of Southern California, Xinyi Zhou University of Southern California, Morteza Ziyadi Amazon AGI, Nenad Medvidović University of Southern California, Lars Lindemann University of Southern California, Souti Chattopadhyay University of Southern California Pre-print | ||
15:15 15mTalk | What Guides Our Choices? Modeling Developers' Trust and Behavioral Intentions Towards GenAI Research Track Rudrajit Choudhuri Oregon State University, Bianca Trinkenreich Colorado State University, Rahul Pandita GitHub, Inc., Eirini Kalliamvakou GitHub, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University, Christopher Sanchez Oregon State University, Anita Sarma Oregon State University Pre-print |