Retrieve-Guided Commit Message Generation with Semantic Similarity And Disparity
High-quality commit messages, which describe the content of code changes, are important for program understanding and maintenance. Neural-based methods are the most popular way to generate commit messages, but they neglect retrieval results, namely retrieval diffs and retrieval messages. Models that combine neural-based and retrieval-based methods have two major limitations: (a) they ignore retrieval messages, and (b) they seldom consider the similarity and disparity between the retrieval results and the given diff. To address these two issues, we propose a retrieve-guided method named ReGenSD to generate commit messages. Specifically, we decompose the commit message generation task into three steps. First, we apply a similarity-based IR technique to obtain the retrieval diff and retrieval message. Second, we introduce a selective mechanism that chooses the subsequent generation model based on the lexical similarity between the retrieval diff and the given diff. Last, we design two kinds of generation models. The simple generation model is a seq2seq network that takes only the given diff as input. The retrieve-guided model is a novel seq2seq network with Bi-LSTM that takes the given diff, the retrieval diff, and the retrieval message as input. To capture semantic relations between the retrieval results and the given diff, we introduce a relation gate in the encoder that leverages the retrieval message adaptively based on semantic similarity, and a difference vector in the decoder that refines the use of the retrieval message based on semantic disparity. Experimental results on an open-source dataset demonstrate that guidance from retrieval messages facilitates the commit message generation task. In addition, ablation experiments confirm the effectiveness of the proposed mechanisms in adjusting the use of retrieval results.
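The first two steps of the abstract (retrieval, then model selection by lexical similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the bag-of-words cosine similarity, the `threshold` value of 0.5, and the rule that a high similarity routes to the retrieve-guided model are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math
from collections import Counter


def tokenize(diff: str) -> list[str]:
    # Assumption: plain lowercased whitespace tokenization of the diff text.
    return diff.lower().split()


def cosine_similarity(a: list[str], b: list[str]) -> float:
    # Bag-of-words cosine similarity between two token lists.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(given_diff: str, corpus: list[tuple[str, str]]) -> tuple[float, str, str]:
    # corpus holds (diff, message) pairs and must be non-empty.
    # Returns (similarity, retrieval diff, retrieval message) for the best match.
    query = tokenize(given_diff)
    scored = [(cosine_similarity(query, tokenize(d)), d, m) for d, m in corpus]
    return max(scored, key=lambda item: item[0])


def select_model(similarity: float, threshold: float = 0.5) -> str:
    # Assumption: a sufficiently similar retrieval diff justifies the
    # retrieve-guided model; otherwise fall back to the simple model.
    return "retrieve_guided" if similarity >= threshold else "simple"


if __name__ == "__main__":
    corpus = [
        ("- return x\n+ return x + 1", "Fix off-by-one in return value"),
        ("+ import os", "Add os import"),
    ]
    sim, diff, message = retrieve("- return y\n+ return y + 1", corpus)
    print(select_model(sim), message)
```

In this sketch the selection step only returns a label; in the method described above, that decision would determine whether the simple seq2seq model or the retrieve-guided model generates the message.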
Thu 8 Dec. Displayed time zone: Osaka, Sapporo, Tokyo
| 15:00 - 16:30 | Machine Learning 2, Technical Track, at Room3. Chair(s): Morakot Choetkiertikul (Mahidol University, Thailand) |
| 15:00, 20m, Paper | Retrieve-Guided Commit Message Generation with Semantic Similarity And Disparity. Technical Track. Zhihan Li (School of Computer Science and Engineering, Central South University); Yi Cheng (School of Computer Science and Engineering, Central South University); Haiyang Yang (School of Computer Science and Engineering, Central South University); Li Kuang (School of Computer Science and Engineering, Central South University); Lingyan Zhang (School of Computer Science and Engineering, Central South University) |
| 15:20, 20m, Paper | Systematic Analysis of Defect Specific Code Abstraction for Neural Program Repair. Technical Track. Kicheol Kim (Sungkyunkwan University); Misoo Kim (Sungkyunkwan University); Eunseok Lee (Sungkyunkwan University) |
| 15:40, 20m, Paper | NEGAR: Network Embedding Guided Architecture Recovery for Software Systems. Technical Track. Jiayi Chen (State Key Lab for Novel Software Technology, Nanjing University); Zhixing Wang (State Key Lab for Novel Software Technology, Nanjing University); yuchen jiang; Tian Zhang (Nanjing University); Jun Pang (University of Luxembourg); Minxue Pan (Nanjing University); Nitsan Amit (Hebrew University) |
| 16:00, 20m, Paper | Goal-oriented Knowledge Reuse via Curriculum Evolution for Reinforcement Learning-based Adaptation. Technical Track. Jialong Li (Waseda University, Japan); Mingyue Zhang (Peking University, China); Zhenyu Mao (Waseda University); Haiyan Zhao (Peking University); Zhi Jin (Peking University); Shinichi Honiden (Waseda University / National Institute of Informatics, Japan); Kenji Tei (Waseda University) |