ICSE 2023 / ICPC 2023 / Early Research Achievements (ERA)
Investigating the Generalizability of Deep Learning-based Clone Detectors
Generalizability is a significant challenge for Deep Learning (DL) models: a model that generalizes poorly has overfitted to its training data and fails to perform well on new data. Despite the many DL-based clone detectors that have emerged in recent years, their generalizability has not been thoroughly assessed. This study investigates the generalizability of three DL-based clone detectors (CCLearner, ASTNN, and CodeBERT) by comparing their detection accuracy across different training and testing datasets. The results show that none of the three clone detectors generalizes well to new data, and that for CCLearner and ASTNN there is a strong relationship between clone type and generalizability.
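The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a binary clone/non-clone labelling and uses the F1 score as the accuracy measure; the abstract does not state which metric the study uses, so that choice is an assumption. The function names and the toy labels are hypothetical and are not results from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = clone pair, 0 = non-clone pair)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def generalization_gap(same_dataset_f1: float, cross_dataset_f1: float) -> float:
    """Drop in F1 when the test data comes from a different dataset than the training data."""
    return same_dataset_f1 - cross_dataset_f1


# Hypothetical toy labels, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
same_dataset_preds = [1, 1, 1, 0, 0, 0]   # detector tested on held-out data from its training dataset
cross_dataset_preds = [1, 0, 0, 0, 1, 0]  # same detector tested on pairs from a different dataset

same_f1 = f1_score(labels, same_dataset_preds)
cross_f1 = f1_score(labels, cross_dataset_preds)
gap = generalization_gap(same_f1, cross_f1)
```

A large positive gap would suggest the detector has fitted characteristics of its training dataset rather than learned clone detection in general. Computing the gap separately per clone type would be one way to look for the relationship between clone type and generalizability that the abstract reports.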
Tue 16 May