Comments on Comments: Where Code Review and Documentation Meet (MSR 2022 - Technical Papers)

Who

Nikitha Rao, Jason Tsay, Martin Hirzel, Vincent J. Hellendoorn

Track

MSR 2022 Technical Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 17 May 2022 22:04 - 22:08 at MSR Main room - even hours - Session 1 Chair(s): Hongyu Zhang, Masud Rahman
Mon 23 May 2022 14:15 - 14:23 at Room 315+316 - Blended Technical Session 2 (Machine Learning and Information Retrieval) Chair(s): Preetha Chatterjee

Abstract

An important function of code review is to increase understanding; helping reviewers understand a code change aids in knowledge transfer and finding bugs. Comments in code largely serve a similar purpose, helping future readers understand the program. It is thus natural to study what happens when these two forms of understanding collide. We ask: what documentation-related comments do reviewers make and how do they affect understanding of the contribution? We analyze ca. 700K review comments on 2,000 (Java and Python) GitHub projects, and propose several filters to identify which comments are likely to be in response to a change in documentation and/or a call for such a change. We identify 65K such cases. We next develop a taxonomy of the reviewer intents behind such “comments on comments”. We find that achieving a shared understanding of the code is key: reviewer comments most often focused on clarification, followed by pointing out issues to fix, such as typos and outdated comments. Curiously, clarifying comments were frequently suggested (often verbatim) by the reviewer, indicating a desire to persist their understanding acquired during code review. We conclude with a discussion of implications of our comments-on-comments dataset for research on improving code review, including the potential benefits for automating code review.

Link to Preprint

https://arxiv.org/abs/2204.00107

DOI

https://doi.org/10.1145/3524842.3528475

File attachments

Comments On Comments: Where Code Review and Documentation Meet (296-Tech-Rao.mp4)	6.92MiB

Nikitha Rao

Carnegie Mellon University

United States

Jason Tsay

IBM Research

United States

Martin Hirzel

IBM Research

United States

Vincent J. Hellendoorn

Carnegie Mellon University

United States

Session Program

Tue 17 May
Displayed time zone: Eastern Time (US & Canada)

22:00 - 22:50	Session 1 Technical Papers / Registered Reports at MSR Main room - even hours Chair(s): Hongyu Zhang University of Newcastle, Masud Rahman Dalhousie University

22:00 4m Short-paper		An Empirical Evaluation of GitHub Copilot’s Code Suggestions Technical Papers Nhan Nguyen University of Alberta, Sarah Nadi University of Alberta DOI Pre-print
22:04 4m Short-paper		Comments on Comments: Where Code Review and Documentation Meet Technical Papers Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University DOI Pre-print File Attached
22:08 7m Talk		Does This Apply to Me? An Empirical Study of Technical Context in Stack Overflow Technical Papers Akalanka Galappaththi University of Alberta, Sarah Nadi University of Alberta, Christoph Treude University of Melbourne DOI Pre-print Media Attached
22:15 7m Talk		Towards Reliable Agile Iterative Planning via Predicting Documentation Changes of Work Items Technical Papers Jirat Pasuksmit University of Melbourne, Patanamon Thongtanunam University of Melbourne, Shanika Karunasekera The University of Melbourne
22:22 7m Talk		BotHunter: An Approach to Detect Software Bots in GitHub Technical Papers Ahmad Abdellatif Concordia University, Mairieli Wessel Delft University of Technology, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University, USA, Emad Shihab Concordia University Pre-print
22:29 7m Talk		Recommending Code Improvements Based on Stack Overflow Answer Edits Registered Reports Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Matheus Paixao University of Fortaleza Pre-print
22:36 14m Live Q&A		Discussions and Q&A Technical Papers

Mon 23 May
Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:00	Blended Technical Session 2 (Machine Learning and Information Retrieval) Technical Papers / Data and Tool Showcase Track at Room 315+316 Chair(s): Preetha Chatterjee Drexel University, USA

13:30 15m Talk		Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health) Technical Papers Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University Pre-print Media Attached
13:45 15m Talk		GraphCode2Vec: Generic Code Embedding via Lexical and Program Dependence Analyses Technical Papers Wei Ma SnT, University of Luxembourg, Mengjie Zhao LMU Munich, Ezekiel Soremekun SnT, University of Luxembourg, Qiang Hu University of Luxembourg, Jie M. Zhang King's College London, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Xiaofei Xie Singapore Management University, Singapore, Yves Le Traon University of Luxembourg, Luxembourg Pre-print
14:00 15m Talk		Senatus: A Fast and Accurate Code-to-Code Recommendation Engine Technical Papers Fran Silavong JP Morgan Chase & Co., Sean Moran JP Morgan Chase & Co., Antonios Georgiadis JP Morgan Chase & Co., Rohan Saphal JP Morgan Chase & Co., Robert Otter JP Morgan Chase & Co. DOI Pre-print Media Attached
14:15 8m Short-paper		Comments on Comments: Where Code Review and Documentation Meet Technical Papers Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University DOI Pre-print File Attached
14:23 8m Short-paper		On the Naturalness of Fuzzer Generated Code Technical Papers Rajeswari Hita Kambhamettu Carnegie Mellon University, John Billos Wake Forest University, Carolyn "Tomi" Oluwaseun-Apo Pennsylvania State University, Benjamin Gafford Carnegie Mellon University, Rohan Padhye Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
14:31 8m Talk		SOSum: A Dataset of Stack Overflow Post Summaries Data and Tool Showcase Track Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:39 21m Live Q&A		Discussions and Q&A Technical Papers
