Predictive Comment Updating with Heuristics and AST-Path-Based Neural Learning: A Two-Phase Approach (ICSE 2023 - Journal-First Papers)

Who

Bo Lin, Shangwen Wang, Zhongxin Liu, Xin Xia, Xiaoguang Mao

Track

ICSE 2023 Journal-First Papers

Time Zone

The program is currently displayed in (GMT+10:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 17 May 2023 17:00 - 17:07 at Level G - Plenary Room 1 - Documentation Chair(s): Denys Poshyvanyk

Abstract

Just-in-time comment update is a promising way to reduce the burden on developers during software maintenance and evolution. Existing approaches fall into two categories: heuristic-based and deep-learning-based. The heuristic-based approach is restricted to a specific type of comment update (i.e., code-indicative updates), but performs well on that type. The deep-learning-based approach can handle diverse comment updates, but its effectiveness is limited. Considering the complementary advantages of existing approaches, an intuitive idea is to combine them for better performance. To investigate this idea, we first conduct a pre-study experiment, which shows that constructing an effective comment updater by combining heuristic-based and deep-learning-based approaches requires tackling two main challenges: 1) the heuristic-based approach may bring side effects to cases that it cannot update; and 2) the current deep-learning-based approach has limited effectiveness. We then propose a novel two-phase approach named Toper to cope with these two challenges and effectively perform comment updates. In the first phase, Toper integrates nine distinctive features, identified through our large-scale empirical analysis, into a predictive model that predicts whether the contents of a comment update can be found in the corresponding code changes, that is, whether the comment update is code-indicative. If so, the update is generated by an off-the-shelf heuristic-based approach; otherwise, Toper leverages a deep learning model, which we specially designed for non-code-indicative updates, to infer the new comment from the old comment and the code change.
Motivated by our manual observation of the limitations of existing approaches on non-code-indicative updates, our deep learning model adopts the Abstract Syntax Tree (AST) path technique, which captures program structure information to embed code changes effectively. Our evaluation shows that our approach outperforms the state-of-the-art by around 20% with respect to the number of correct comments it generates. Via in-depth analysis, we illustrate the rationale behind each design decision and point out potential directions.
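
The two-phase routing described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `CommentUpdateTask`, `route_update`, and the `toy_*` functions are hypothetical, the toy predictor is a single token-overlap check rather than the nine features of Toper's predictive model, and the neural updater is a placeholder rather than the AST-path-based model.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class CommentUpdateTask:
    """One update request: the old comment plus the code before and after the change."""
    old_comment: str
    old_code: str
    new_code: str


Updater = Callable[[CommentUpdateTask], str]


def route_update(
    task: CommentUpdateTask,
    is_code_indicative: Callable[[CommentUpdateTask], bool],
    heuristic_updater: Updater,
    neural_updater: Updater,
) -> str:
    """Phase 1 predicts the update type; phase 2 dispatches to exactly one updater."""
    if is_code_indicative(task):
        return heuristic_updater(task)
    return neural_updater(task)


def _identifiers(text: str) -> set:
    """Extract identifier-like tokens from code or comment text."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def toy_is_code_indicative(task: CommentUpdateTask) -> bool:
    """Toy stand-in for the predictor (NOT the paper's features):
    true if an identifier removed by the code change appears in the old comment."""
    removed = _identifiers(task.old_code) - _identifiers(task.new_code)
    return bool(removed & _identifiers(task.old_comment))


def toy_heuristic_updater(task: CommentUpdateTask) -> str:
    """Toy stand-in for a heuristic updater: handles a single identifier rename."""
    removed = sorted(_identifiers(task.old_code) - _identifiers(task.new_code))
    added = sorted(_identifiers(task.new_code) - _identifiers(task.old_code))
    if len(removed) == 1 and len(added) == 1:
        pattern = rf"\b{re.escape(removed[0])}\b"
        return re.sub(pattern, added[0], task.old_comment)
    return task.old_comment  # leave the comment untouched when unsure


def toy_neural_updater(task: CommentUpdateTask) -> str:
    """Placeholder for a learned model; a real one would generate a new comment."""
    return task.old_comment


if __name__ == "__main__":
    rename = CommentUpdateTask(
        old_comment="Returns the userId of the account.",
        old_code="return userId;",
        new_code="return accountId;",
    )
    print(route_update(rename, toy_is_code_indicative,
                       toy_heuristic_updater, toy_neural_updater))
```

Keeping the predictor separate from both updaters reflects the motivation given in the abstract: the heuristic is only applied to cases predicted to be code-indicative, so it cannot introduce side effects on the remaining cases.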

Link to Publication

https://ieeexplore.ieee.org/document/9808190

Link to Preprint

https://shangwenwang.github.io/files/TSE-22.pdf

DOI

https://doi.org/10.1109/TSE.2022.3185458

Bo Lin

National University of Defense Technology

China

Shangwen Wang

National University of Defense Technology

China

Zhongxin Liu

Zhejiang University

China

Xin Xia

Huawei

China

Xiaoguang Mao

National University of Defense Technology

China

Session Program

Wed 17 May
Displayed time zone: Hobart

15:45 - 17:15	Documentation (Technical Track / Journal-First Papers) at Level G - Plenary Room 1. Chair(s): Denys Poshyvanyk (College of William and Mary)

15:45 15m Talk	Developer-Intent Driven Code Comment Generation (Technical Track). Fangwen Mu (Institute of Software Chinese Academy of Sciences), Xiao Chen (Institute of Software Chinese Academy of Sciences), Lin Shi (ISCAS), Song Wang (York University), Qing Wang (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences)
16:00 15m Talk	Data Quality Matters: A Case Study of Obsolete Comment Detection (Technical Track). Shengbin Xu (Nanjing University), Yuan Yao (Nanjing University), Feng Xu (Nanjing University), Tianxiao Gu (TikTok Inc.), Jingwei Xu, Xiaoxing Ma (Nanjing University)
16:15 15m Talk	Revisiting Learning-based Commit Message Generation (Technical Track). Jinhao Dong (Peking University), Yiling Lou (Fudan University), Dan Hao (Peking University), Lin Tan (Purdue University)
16:30 15m Talk	Commit Message Matters: Investigating Impact and Evolution of Commit Message Quality (Technical Track). Jiawei Li (University of California, Irvine), Iftekhar Ahmed (University of California at Irvine)
16:45 7m Talk	On the Significance of Category Prediction for Code-Comment Synchronization (Journal-First Papers). Zhen Yang (City University of Hong Kong, China), Jacky Keung (City University of Hong Kong), Xiao Yu (Wuhan University of Technology), Yan Xiao (National University of Singapore), Zhi Jin (Peking University), Jingyu Zhang (City University of Hong Kong)
16:52 7m Talk	Correlating Automated and Human Evaluation of Code Documentation Generation Quality (Journal-First Papers). Xing Hu (Zhejiang University), Qiuyuan Chen (Zhejiang University), Haoye Wang (Hangzhou City University), Xin Xia (Huawei), David Lo (Singapore Management University), Thomas Zimmermann (Microsoft Research)
17:00 7m Talk	Predictive Comment Updating with Heuristics and AST-Path-Based Neural Learning: A Two-Phase Approach (Journal-First Papers). Bo Lin (National University of Defense Technology), Shangwen Wang (National University of Defense Technology), Zhongxin Liu (Zhejiang University), Xin Xia (Huawei), Xiaoguang Mao (National University of Defense Technology)