Context-aware Retrieval-based Deep Commit Message Generation
Thu 12 May 2022 04:05 - 04:10 at ICSE room 4-even hours - Evolution and Maintenance 2 Chair(s): Sigrid Eldh
Commit messages recorded in version control systems contain valuable information for software development, maintenance, and comprehension. Unfortunately, developers often commit code with empty or poor-quality commit messages. To address this issue, several studies have proposed approaches that generate commit messages from commit diffs. Recent studies use neural machine translation algorithms to translate git diffs into commit messages and have achieved promising results. However, these learning-based methods tend to generate high-frequency words and ignore low-frequency ones. In addition, they suffer from exposure bias, which leads to a gap between the training phase and the testing phase.
In this paper, we propose CoRec to address the above two limitations. Specifically, we first train a context-aware encoder-decoder model that randomly selects either the previous output of the decoder or the embedding vector of a ground-truth word as context, making the model gradually aware of previous alignment choices. Given a diff for testing, the trained model is reused to retrieve the most similar diff from the training set. Finally, we use the retrieved diff to guide the probability distribution over the vocabulary for the final generated message. Our method combines the advantages of both information retrieval and neural machine translation. We evaluate CoRec on a dataset from Liu et al. and a large-scale dataset crawled from 10k popular Java repositories on GitHub. Our experimental results show that CoRec significantly outperforms the state-of-the-art method NNGen by 19% on average in terms of BLEU.
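The three steps in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the similarity measure or the way the retrieved diff guides the distribution, so cosine similarity, linear interpolation, and the names `choose_context`, `retrieve_most_similar`, `guide_distribution`, `p_truth`, and `lam` are all assumptions made for illustration.

```python
import math
import random


def choose_context(prev_prediction, ground_truth, p_truth, rng=random):
    """Pick the decoder's next input during training.

    With probability p_truth use the ground-truth word, otherwise use the
    decoder's own previous output (hypothetical helper, not from the paper).
    """
    return ground_truth if rng.random() < p_truth else prev_prediction


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_most_similar(query_vec, train_vecs):
    """Return the index of the training diff whose encoding is closest."""
    return max(range(len(train_vecs)),
               key=lambda i: cosine(query_vec, train_vecs[i]))


def guide_distribution(p_input, p_retrieved, lam):
    """Blend two next-word distributions by linear interpolation.

    lam is an assumed weight for the distribution conditioned on the
    retrieved diff; the result is renormalised to sum to 1.
    """
    vocab = set(p_input) | set(p_retrieved)
    mixed = {w: (1 - lam) * p_input.get(w, 0.0) + lam * p_retrieved.get(w, 0.0)
             for w in vocab}
    total = sum(mixed.values())
    return {w: p / total for w, p in mixed.items()}


if __name__ == "__main__":
    # Toy encodings standing in for encoder outputs of training diffs.
    train_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
    best = retrieve_most_similar([0.5, 0.9], train_vecs)
    mixed = guide_distribution({"fix": 0.7, "update": 0.3},
                               {"fix": 0.2, "typo": 0.8}, lam=0.5)
    print(best, max(mixed, key=mixed.get))
```

In a real system the vectors would come from the trained encoder and the distributions from the decoder's softmax at each step; the sketch only shows how the pieces could fit together under the stated assumptions.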
Tue 10 May. Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:00 | Evolution and Maintenance 4 | NIER - New Ideas and Emerging Results / Journal-First Papers / Technical Track | ICSE room 1-odd hours | Chair(s): Sarah Nadi (University of Alberta)
11:00 | 5m Talk | Maintenance-Related Concerns for Post-deployed Ethereum Smart Contract Development: Issues, Techniques, and Future Challenges | Journal-First Papers | Jiachi Chen (Sun Yat-Sen University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), John Grundy (Monash University), Xiaohu Yang (Zhejiang University)
11:05 | 5m Talk | Context-aware Retrieval-based Deep Commit Message Generation | Journal-First Papers | Haoye Wang (Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), Qiang He (Swinburne University of Technology), Xinyu Wang (Zhejiang University), John Grundy (Monash University)
11:10 | 5m Talk | Self-Admitted Technical Debt Practices: A Comparison Between Industry and Open-Source | Journal-First Papers | Fiorella Zampetti (University of Sannio, Italy), Gianmarco Fucci (University of Sannio), Alexander Serebrenik (Eindhoven University of Technology), Massimiliano Di Penta (University of Sannio, Italy)
11:15 | 5m Talk | BreakBot: Analyzing the Impact of Breaking Changes to Assist Library Evolution (NIER-track Award) | NIER - New Ideas and Emerging Results | Lina Ochoa (Eindhoven University of Technology), Thomas Degueule (CNRS, LaBRI), Jean-Rémy Falleri (Univ. Bordeaux, Bordeaux INP, CNRS, LaBRI; Institut Universitaire de France)
11:20 | 5m Talk | Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques | Technical Track | Zanis Ali Khan (University of Luxembourg), Donghwan Shin (University of Luxembourg), Domenico Bianculli (University of Luxembourg), Lionel Briand (University of Luxembourg; University of Ottawa)
11:25 | 5m Talk | Automated Patching for Unreproducible Builds | Technical Track | Zhilei Ren (Dalian University of Technology), Shiwei Sun (Dalian University of Technology), Jifeng Xuan (Wuhan University), Xiaochen Li (University of Luxembourg), Zhide Zhou (Dalian University of Technology), He Jiang (School of Software, Dalian University of Technology)
Thu 12 May. Displayed time zone: Eastern Time (US & Canada)
04:00 - 05:00 | Evolution and Maintenance 2 | Technical Track / Journal-First Papers | ICSE room 4-even hours | Chair(s): Sigrid Eldh (Ericsson AB, Mälardalen University, Carleton University)
04:00 | 5m Talk | Maintenance-Related Concerns for Post-deployed Ethereum Smart Contract Development: Issues, Techniques, and Future Challenges | Journal-First Papers | Jiachi Chen (Sun Yat-Sen University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), John Grundy (Monash University), Xiaohu Yang (Zhejiang University)
04:05 | 5m Talk | Context-aware Retrieval-based Deep Commit Message Generation | Journal-First Papers | Haoye Wang (Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), Qiang He (Swinburne University of Technology), Xinyu Wang (Zhejiang University), John Grundy (Monash University)
04:10 | 5m Talk | Recommending Good First Issues in GitHub OSS Projects | Technical Track | Wenxin Xiao (School of Computer Science, Peking University), Hao He (Peking University), Weiwei Xu (School of Computer Science and Technology, Soochow University), Xin Tan (Beihang University, China), Jinhao Dong (Peking University), Minghui Zhou (Peking University, China)
04:15 | 5m Talk | Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques | Technical Track | Zanis Ali Khan (University of Luxembourg), Donghwan Shin (University of Luxembourg), Domenico Bianculli (University of Luxembourg), Lionel Briand (University of Luxembourg; University of Ottawa)
04:20 | 5m Talk | Automated Patching for Unreproducible Builds | Technical Track | Zhilei Ren (Dalian University of Technology), Shiwei Sun (Dalian University of Technology), Jifeng Xuan (Wuhan University), Xiaochen Li (University of Luxembourg), Zhide Zhou (Dalian University of Technology), He Jiang (School of Software, Dalian University of Technology)
04:25 | 5m Talk | Using Pre-Trained Models to Boost Code Review Automation | Technical Track | Rosalia Tufano (Università della Svizzera Italiana), Simone Masiero (Software Institute @ Università della Svizzera Italiana), Antonio Mastropaolo (Università della Svizzera italiana), Luca Pascarella (Università della Svizzera italiana, USI), Denys Poshyvanyk (William and Mary), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)