Predicting Change Propagation between Code Clone Instances by Graph-based Deep Learning
Code clones are widespread in open-source and industrial software projects. Thanks to advances in clone detection techniques, developers can readily become aware of code clones. However, code clones are still recognized as a threat to software maintenance, both because multiple clone instances must be maintained simultaneously and because inconsistent changes to clone instances can introduce defects. To alleviate the threat, it is essential to decide accurately and efficiently whether a change should be propagated between clone instances. To understand the problem, we conduct an exploratory study on clone change propagation with five well-known open-source projects. A key finding of the study is that a clone class can have both propagation-required changes and propagation-free changes, so fine-grained change propagation decisions are required to determine whether a change made to a clone instance needs to be propagated to the other instances of the same clone class. Based on the findings, we propose a graph-based deep learning approach to predict the change propagation requirements of clone instances. We develop a graph representation, named Fused Clone Program Dependency Graph (FC-PDG), to capture the textual and structural code contexts of a pair of clone instances along with the change to one of them. Based on the representation, we design a deep learning model that uses a Relational Graph Convolutional Network (R-GCN) to predict the propagation requirement of a code change to a clone instance. We evaluate the approach with a dataset constructed from 51 open-source Java projects, which includes 24,672 pairs of matched changes and 38,041 non-matched changes. The results show that the approach achieves a precision of 83.1%, a recall of 81.2%, and an F1-score of 82.1%. Our further evaluation with three other open-source projects confirms the generalizability of the trained clone change propagation prediction model.
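To make the R-GCN idea mentioned in the abstract more concrete, here is a minimal sketch of one relational graph convolution step in plain Python. It is not the authors' implementation: the function names, the toy graph, the weight matrices, and the edge-type labels ("control", "data", "clone") are all hypothetical and chosen only for illustration; the abstract does not specify FC-PDG edge types. The sketch uses the commonly cited R-GCN update, in which each node combines its own transformed features with the per-relation mean of its neighbours' transformed features, followed by ReLU.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One R-GCN step: h_i' = ReLU(W_0 h_i + sum_r mean_{j in N_r(i)} W_r h_j).

    features:    {node: feature vector}
    edges:       list of (source, relation, target) triples
    rel_weights: {relation: weight matrix W_r}
    self_weight: weight matrix W_0 for the self-connection
    """
    # Group incoming neighbours by (target node, relation type).
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        incoming[(dst, rel)].append(src)

    new_features = {}
    for node, h in features.items():
        acc = matvec(self_weight, h)  # self-connection term
        for rel, W in rel_weights.items():
            neighbours = incoming.get((node, rel), [])
            for j in neighbours:
                msg = matvec(W, features[j])
                # Normalise by the number of neighbours under this relation.
                acc = [a + m / len(neighbours) for a, m in zip(acc, msg)]
        new_features[node] = [max(0.0, a) for a in acc]  # ReLU
    return new_features


# Toy graph with three nodes and three hypothetical edge types.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, "control", 1), (2, "data", 1), (1, "clone", 2)]
rel_weights = {
    "control": [[1, 0], [0, 1]],  # identity
    "data": [[2, 0], [0, 2]],     # scale by 2
    "clone": [[0, 1], [1, 0]],    # swap the two components
}
self_weight = [[1, 0], [0, 1]]

out = rgcn_layer(features, edges, rel_weights, self_weight)
print(out)
```

In a trained model the weight matrices would be learned and several such layers would feed a classifier; here they are fixed by hand so the effect of each relation type on a node's representation is easy to trace.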
Mon 16 May. Displayed time zone: Eastern Time (US & Canada)
22:00 - 22:50 | Session 10: Code Clones | Research / Early Research Achievements (ERA) at ICPC room | Chair(s): Chaiyong Ragkhitwetsagul (Mahidol University, Thailand)
22:00 | 7m | Talk | C4: Contrastive Cross-Language Code Clone Detection | Research | Chenning Tao (Zhejiang University), Qi Zhan (Zhejiang University), Xing Hu (Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab)
22:07 | 7m | Talk | Predicting Change Propagation between Code Clone Instances by Graph-based Deep Learning | Research | Bin Hu (Fudan University), Yijian Wu (Fudan University), Xin Peng (Fudan University), Chaofeng Sha (Fudan University), Xiaocheng Wang (Fudan University), Baiqiang Fu (Fudan University), Wenyun Zhao (Fudan University, China)
22:14 | 4m | Talk | An Exploratory Study of Analyzing JavaScript Online Code Clones | Early Research Achievements (ERA)
22:18 | 7m | Talk | Exploring and Understanding Cross-service Code Clones in Microservice Projects | Research | Yang Zhao (Central China Normal University), Ran Mo (Central China Normal University), Yao Zhang (Central China Normal University), Siyuan Zhang (Central China Normal University), Pu Xiong (Central China Normal University)
22:25 | 7m | Talk | MSCCD: Grammar Pluggable Clone Detection Based on ANTLR Parser Generation | Research | Wenqing ZHU (Nagoya University), Norihiro Yoshida (Ritsumeikan University), Toshihiro Kamiya (Shimane University), Eunjong Choi (Kyoto Institute of Technology), Hiroaki Takada (Nagoya University)
22:32 | 7m | Talk | Algorithm Identification in Programming Assignments | Research | Pranshu Chourasia (Indian Institute of Technology - Bombay), Ganesh Ramakrishnan (Indian Institute of Technology - Bombay), Varsha Apte (Indian Institute of Technology - Bombay), Suraj Kumar (Indian Institute of Technology - Bombay)
22:39 | 11m | Live Q&A | Q&A-Paper Session 10 | Research