C4: Contrastive Cross-Language Code Clone Detection
During software development, developers introduce code clones by reusing existing code to improve programming productivity. Because clones can harm software maintenance and evolution, many techniques have been proposed to detect them. Existing approaches mainly detect clones written in the same programming language. However, it is common to develop programs with the same functionality in different programming languages to support various platforms. In this paper, we propose a new approach named C4, short for Contrastive Cross-language Code Clone detection. It detects cross-language clones effectively using learned representations. C4 uses the pre-trained model CodeBERT to convert programs in different languages into representations. In addition, we fine-tune the C4 model with a contrastive learning objective that separates clone pairs from non-clone pairs. To evaluate the effectiveness of our approach, we conduct extensive experiments on the dataset proposed by CLCDSA. Experimental results show that C4 achieves scores of 0.94, 0.90, and 0.92 in terms of precision, recall, and F-measure, respectively, and substantially outperforms the state-of-the-art baselines.
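The abstract does not spell out the training objective, so the following is only a minimal sketch of one common form of contrastive loss (an in-batch, InfoNCE-style loss over cosine similarities), not necessarily the loss used in C4. The toy two-dimensional vectors are hypothetical stand-ins for CodeBERT representations, and the names `cosine`, `contrastive_loss`, `f_measure`, and the `temperature` parameter are illustrative assumptions. The last helper assumes the reported F-measure is the harmonic mean of precision and recall (F1).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, temperature=0.1):
    """In-batch contrastive loss (InfoNCE-style), an assumed formulation.

    anchors[i] and positives[i] are embeddings of a clone pair written in
    two different languages; positives[j] with j != i act as non-clones
    for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the true clone.
        total += log_denominator - logits[i]
    return total / len(anchors)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical embeddings: one vector per program, two programs per language.
java_vecs = [[1.0, 0.0], [0.0, 1.0]]
python_vecs_aligned = [[0.9, 0.1], [0.1, 0.9]]   # clones sit close together
python_vecs_swapped = [[0.1, 0.9], [0.9, 0.1]]   # clones sit far apart

loss_aligned = contrastive_loss(java_vecs, python_vecs_aligned)
loss_swapped = contrastive_loss(java_vecs, python_vecs_swapped)
reported_f = f_measure(0.94, 0.90)
```

In this sketch the loss is small when each program is most similar to its own clone and large when it is more similar to an unrelated program, which is the behavior fine-tuning would try to encourage. Under the F1 assumption, the reported precision of 0.94 and recall of 0.90 give an F-measure that rounds to the reported 0.92.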
Mon 16 May. Displayed time zone: Eastern Time (US & Canada)
22:00 - 22:50 | Session 10: Code Clones | Research / Early Research Achievements (ERA) at ICPC room | Chair(s): Chaiyong Ragkhitwetsagul (Mahidol University, Thailand)
22:00 | 7m Talk | C4: Contrastive Cross-Language Code Clone Detection | Research | Chenning Tao (Zhejiang University), Qi Zhan (Zhejiang University), Xing Hu (Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab)
22:07 | 7m Talk | Predicting Change Propagation between Code Clone Instances by Graph-based Deep Learning | Research | Bin Hu (Fudan University), Yijian Wu (Fudan University), Xin Peng (Fudan University), Chaofeng Sha (Fudan University), Xiaocheng Wang (Fudan University), Baiqiang Fu (Fudan University), Wenyun Zhao (Fudan University, China)
22:14 | 4m Talk | An Exploratory Study of Analyzing JavaScript Online Code Clones | Early Research Achievements (ERA)
22:18 | 7m Talk | Exploring and Understanding Cross-service Code Clones in Microservice Projects | Research | Yang Zhao (Central China Normal University), Ran Mo (Central China Normal University), Yao Zhang (Central China Normal University), Siyuan Zhang (Central China Normal University), Pu Xiong (Central China Normal University)
22:25 | 7m Talk | MSCCD: Grammar Pluggable Clone Detection Based on ANTLR Parser Generation | Research | Wenqing Zhu (Nagoya University), Norihiro Yoshida (Ritsumeikan University), Toshihiro Kamiya (Shimane University), Eunjong Choi (Kyoto Institute of Technology), Hiroaki Takada (Nagoya University)
22:32 | 7m Talk | Algorithm Identification in Programming Assignments | Research | Pranshu Chourasia (Indian Institute of Technology - Bombay), Ganesh Ramakrishnan (Indian Institute of Technology - Bombay), Varsha Apte (Indian Institute of Technology - Bombay), Suraj Kumar (Indian Institute of Technology - Bombay)
22:39 | 11m Live Q&A | Q&A-Paper Session 10 | Research