CCGraph: a PDG-based code clone detector with approximate graph matching
Software clone detection is an active research area that is important for software maintenance, bug detection, and related tasks. Two pieces of cloned code exhibit similarity or equivalence in the syntax or structure of their code representations. Code can be represented in many ways, such as tokens, ASTs, and PDGs. The PDG (Program Dependency Graph) of source code captures both syntactic and structural information. However, most existing PDG-based tools are time-consuming and miss many clones because they rely on exact graph matching via subgraph isomorphism. In this paper, we propose a novel PDG-based code clone detector, CCGraph, that uses graph kernels. First, we normalize the structure of PDGs and design a two-stage filtering strategy that measures the characteristic vectors of code. Then we detect code clones using an approximate graph matching algorithm based on a reformed WL (Weisfeiler-Lehman) graph kernel. Experimental results show that CCGraph retains high accuracy, achieves better recall and F1-score, and detects more unique clones than two related state-of-the-art tools. Moreover, CCGraph is much more efficient than existing PDG-based tools.
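To make the idea of approximate matching with a graph kernel concrete, the following is a minimal sketch of the standard Weisfeiler-Lehman subtree kernel in Python. It is not the reformed kernel used by CCGraph, whose details the abstract does not give. The function names `wl_features` and `wl_similarity`, the node labels, and the representation of a PDG as a successor adjacency map are all assumptions made for illustration.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count WL labels over all refinement rounds (hypothetical helper).

    adj maps each node to a list of its successor nodes;
    labels maps each node to an initial string label.
    """
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adj.items():
            # A node's new label combines its own label with the
            # sorted multiset of its neighbours' labels.
            neighbour_labels = sorted(current[n] for n in neighbours)
            refined[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = refined
        features.update(current.values())
    return features


def wl_similarity(g1, g2, iterations=2):
    """Cosine-normalised WL subtree kernel between two (adj, labels) graphs."""
    f1 = wl_features(g1[0], g1[1], iterations)
    f2 = wl_features(g2[0], g2[1], iterations)
    dot = sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())
    norm = (sum(v * v for v in f1.values()) * sum(v * v for v in f2.values())) ** 0.5
    return dot / norm if norm else 0.0


# Toy graphs standing in for PDGs (labels are illustrative, not from the paper).
chain = {0: [1], 1: [2], 2: []}
graph_a = (chain, {0: "decl", 1: "assign", 2: "call"})
graph_b = (chain, {0: "decl", 1: "assign", 2: "return"})
graph_c = (chain, {0: "x", 1: "y", 2: "z"})

score_same = wl_similarity(graph_a, graph_a)
score_near = wl_similarity(graph_a, graph_b)
score_far = wl_similarity(graph_a, graph_c)
```

Unlike subgraph isomorphism, which gives a yes-or-no answer, the kernel returns a graded score, so two graphs that differ in a single node still score above zero. A detector built on this idea would report a clone pair when the score exceeds a chosen threshold.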
Thu 24 Sep. Displayed time zone: (UTC) Coordinated Universal Time
02:20 - 03:20
|CCGraph: a PDG-based code clone detector with approximate graph matching|
|Towards Generating Thread-Safe Classes Automatically|
Yi Liu (Southern University of Science and Technology), Jinhui Xie (Tencent Inc.), Jianbo Yang (Tencent Inc.), Shiyu Guo (Tencent Inc.), Yuetang Deng (Tencent, Inc.), Shuqing Li (Southern University of Science and Technology), Yechang Wu (Southern University of Science and Technology), Yepang Liu (Southern University of Science and Technology)