Interpretation-based Code Summarization (ICPC 2023 - Research)

Who

Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Shaomeng Cao, Kechi Zhang, Zhi Jin

Track

ICPC 2023 Research

Time Zone

The program is currently displayed in (GMT+10:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 15 May 2023, 16:03 - 16:12, at Meeting Room 106 (session: Code Summarization and Visualization). Chair(s): Banani Roy, Akhila Sri Manasa Venigalla

Abstract

A code comment, i.e., natural language text that describes the semantics of a code snippet, is an important way for developers to comprehend code. Recently, a number of approaches have been proposed to automatically generate a comment for a given code snippet, aiming to facilitate developers' comprehension activities. Although state-of-the-art approaches already use advanced machine learning techniques such as the Transformer model, they often ignore critical information in the source code, which leads to inaccurate summaries. In this paper, to boost the effectiveness of code summarization, we propose a two-stage paradigm. In the first stage, we train an off-the-shelf model and then, through a model interpretation approach, identify what it focuses on when generating the initial summary. In the second stage, we reinforce the model to generate a higher-quality summary based on the source code and these focuses. Our intuition is that, in this manner, the model can learn to identify which critical information in the code has been captured and which has been missed in its initial summary, and thus revise that summary accordingly, much as a human student learns to write a high-quality summary of a natural language text. Extensive experiments on two large-scale datasets show that our approach significantly boosts the effectiveness of five state-of-the-art code summarization approaches. Specifically, for the well-known code summarizer DeepCom, our two-stage paradigm increases BLEU-4 by around 30% and 25% on the two datasets, respectively.
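
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the interpretation technique, so the sketch assumes a simple occlusion-style attribution (mask one token, measure how much a score drops). The names `occlusion_focus`, `build_stage_two_input`, and `toy_score`, as well as the `<mask>` and `<focus>` markers, are hypothetical and exist only for this illustration.

```python
from typing import Callable, List

Scorer = Callable[[List[str]], float]


def occlusion_focus(tokens: List[str], score: Scorer, top_k: int = 3) -> List[str]:
    """Return the top_k tokens whose masking lowers the score the most.

    `score` stands in for a stage-one model's confidence in its initial
    summary given the code tokens (an assumption of this sketch).
    """
    base = score(tokens)
    drops = []
    for i, tok in enumerate(tokens):
        masked = tokens[:i] + ["<mask>"] + tokens[i + 1:]
        drops.append((base - score(masked), i, tok))
    # Largest drop first; ties are broken by original token position.
    drops.sort(key=lambda d: (-d[0], d[1]))
    return [tok for _, _, tok in drops[:top_k]]


def build_stage_two_input(tokens: List[str], focus: List[str]) -> List[str]:
    """Append the focus tokens to the code tokens behind a separator marker."""
    return tokens + ["<focus>"] + focus


def toy_score(tokens: List[str]) -> float:
    """Hypothetical scorer: a fixed per-token weight table, not a real model."""
    weights = {"sort": 2.0, "list": 1.0, "return": 0.5}
    return sum(weights.get(t, 0.0) for t in tokens)


if __name__ == "__main__":
    code_tokens = ["def", "sort", "list", "return", "x"]
    focus = occlusion_focus(code_tokens, toy_score, top_k=2)
    print(build_stage_two_input(code_tokens, focus))
```

In a real setting, `toy_score` would be replaced by a trained summarizer's likelihood of its own initial summary, and the stage-two input would be fed to a second training round; both of those steps are outside the scope of this sketch.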

Link to Preprint

https://shangwenwang.github.io/files/ICPC-23A.pdf

Mingyang Geng

National University of Defense Technology

Shangwen Wang

National University of Defense Technology

China

Dezun Dong

NUDT

Haotian Wang

National University of Defense Technology

Shaomeng Cao

Peng Cheng Laboratory

Kechi Zhang

Peking University

China

Zhi Jin

Peking University

China

Session Program

Mon 15 May
Displayed time zone: Hobart

15:45 - 17:15	Code Summarization and Visualization (Replications and Negative Results (RENE) / Discussion / Research) at Meeting Room 106. Chair(s): Banani Roy (University of Saskatchewan), Akhila Sri Manasa Venigalla (IIT Tirupati)

15:45 9m Full-paper	An Extensive Study of the Structure Features in Transformer-based Code Semantic Summarization (Research). Kang Yang; Xinjun Mao (National University of Defense Technology); Shangwen Wang (National University of Defense Technology); Yihao Qin (National University of Defense Technology); Yao Lu (National University of Defense Technology); Tanghaoran Zhang; Kamal Al-Sabahi (University of Technology and Applied Sciences-Ibra)
15:54 9m Full-paper	Label Smoothing Improves Neural Source Code Summarization (Research). Sakib Haque (University of Notre Dame); Aakash Bansal (University of Notre Dame); Collin McMillan (University of Notre Dame)
16:03 9m Full-paper	Interpretation-based Code Summarization (Research). Mingyang Geng (National University of Defense Technology); Shangwen Wang (National University of Defense Technology); Dezun Dong (NUDT); Haotian Wang (National University of Defense Technology); Shaomeng Cao (Peng Cheng Laboratory); Kechi Zhang (Peking University, China); Zhi Jin (Peking University)
16:12 9m Full-paper	Naturalness in Source Code Summarization. How Significant is it? (Replications and Negative Results (RENE)). Claudio Ferretti (University of Milano-Bicocca); Martina Saletta (University of Milano-Bicocca)
16:21 9m Full-paper	Comparing 2D and Augmented Reality Visualizations for Microservice System Understandability: A Controlled Experiment (Research). Amr Elsayed (Baylor University); Tomas Cerny (Baylor University); Davide Taibi (Tampere University); Sira Vegas (Universidad Politecnica de Madrid)
16:30 9m Full-paper	ChameleonIDE: Untangling Type Errors Through Interactive Visualization and Exploration (Research). Shuai Fu (Monash University); Tim Dwyer (Monash University); Peter J. Stuckey (Monash University); Jackson Wain (Monash University); Jesse Linossier (Monash University)
16:39 36m Panel	Discussion 4 (Discussion)