A Human Study of Comprehension and Code Summarization
Software developers spend a great deal of time reading and understanding code that is poorly-documented, written by other developers, or developed using differing styles. During the past decade, researchers have investigated techniques for automatically documenting code to improve comprehensibility. In particular, recent advances in deep learning have led to sophisticated summary generation techniques that convert functions or methods to simple English strings that succinctly describe that code's behavior. However, automatic summarization techniques are assessed using automated metrics such as BLEU, which was originally designed to score machine translation output against reference text, or ROUGE, which measures overlap with human-written reference text. Unfortunately, these metrics do not necessarily capture how machine-generated code summaries actually affect human comprehension or developer productivity. We conducted a human study involving both university students and professional developers (n = 45). Participants reviewed Java methods and summaries and answered established program comprehension questions. In addition, participants completed coding tasks given summaries as specifications. Critically, the experiment controlled the source of the summaries: for a given method, some participants were shown human-written text and some were shown machine-generated text. We found that participants performed significantly better (p = 0.029) using human-written summaries versus machine-generated summaries. However, we found no evidence that participants perceive human-written and machine-generated summaries as differing in quality. In addition, participants' performance showed no correlation with the BLEU and ROUGE scores often used to assess the quality of machine-generated summaries. These results suggest a need for revised metrics to assess and guide automatic summarization techniques.
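For readers unfamiliar with these metrics, the sketch below illustrates how a machine-generated summary is typically scored against a human-written reference. It is a minimal, self-contained illustration only: the example summaries, the whitespace tokenizer, and the choice of sentence-level BLEU-4 with add-one smoothing and unigram ROUGE-1 F1 are assumptions made for this sketch, not the exact evaluation pipeline used in the paper.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """Return the multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: smoothed n-gram precisions with a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        ref_ngrams = ngrams(reference, n)
        cand_ngrams = ngrams(candidate, n)
        overlap = sum((cand_ngrams & ref_ngrams).values())
        total = max(sum(cand_ngrams.values()), 1)
        # Add-one smoothing keeps the geometric mean from collapsing to zero.
        precisions.append((overlap + 1) / (total + 1))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return brevity * geo_mean

def rouge1_f1(reference, candidate):
    """Unigram ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts, cand_counts = Counter(reference), Counter(candidate)
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a human-written reference summary for a Java method
# and a machine-generated candidate summary (both invented for illustration).
reference = "returns the index of the first matching element in the list".split()
candidate = "return index of first element that matches in list".split()

print(f"BLEU-4:  {bleu(reference, candidate):.3f}")
print(f"ROUGE-1: {rouge1_f1(reference, candidate):.3f}")
```

Both scores reward word overlap with the reference; the study's central finding is that such overlap did not predict how well participants actually comprehended the code described by a summary.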
Tue 14 Jul (displayed time zone: UTC, Coordinated Universal Time)
01:30 - 02:30 | Session 4: Summarization (Research / ERA at ICPC). Chair(s): Venera Arnaoudova, Washington State University
01:30 (15m) | Improved Code Summarization via a Graph Neural Network (Research). Alexander LeClair (University of Notre Dame), Sakib Haque (University of Notre Dame), Lingfei Wu (IBM Research), Collin McMillan (University of Notre Dame). Pre-print and media attached.
01:45 (15m) | BugSum: Deep Context Understanding for Bug Report Summarization (Research). Haoran Liu, Yue Yu, Shanshan Li, Yong Guo, Deze Wang, and Xiaoguang Mao (National University of Defense Technology). Media attached.
02:00 (15m) | A Human Study of Comprehension and Code Summarization (Research). Sean Stapleton (University of Michigan), Yashmeet Gambhir (University of Michigan), Alexander LeClair (University of Notre Dame), Zachary Eberhart, Westley Weimer (University of Michigan), Kevin Leach (University of Michigan), Yu Huang (University of Michigan). Pre-print and media attached.
02:15 (15m) | Linguistic Documentation of Software History (ERA). Media attached.