ICSE 2020
Mon 5 - Sun 11 October 2020 Yongsan-gu, Seoul, South Korea
Mon 5 Oct 2020 17:35 - 17:50 at TBD7 - Clones and Changes

Context Code-free software similarity detection techniques have been used to support different software engineering tasks, including clustering mobile applications (apps). The way of measuring similarity may affect both the efficiency and quality of clustering solutions. However, there has been no previous comparative study of feature extraction methods used to guide mobile app clustering.

Objective In this paper, we investigate different techniques to compute the similarity of apps based on their textual descriptions and evaluate their effectiveness using hierarchical agglomerative clustering.

Method To this end, we carry out an empirical study comparing five techniques, based on topic modelling and keyword feature extraction, to cluster 12,664 apps randomly sampled from the Google Play App Store. The comparison is based on three main criteria: silhouette width, human judgement and execution time.
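To make the evaluation pipeline concrete, the following is a minimal sketch of hierarchical agglomerative clustering over app descriptions, scored by mean silhouette width. It is illustrative only and not the study's implementation: plain term-frequency vectors with cosine distance stand in for the five feature extraction techniques, average linkage is an assumed linkage criterion, and the four descriptions are invented toy data.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()

def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0

def distance_matrix(descriptions):
    """Pairwise cosine distances between bag-of-words vectors (a stand-in feature extractor)."""
    vecs = [Counter(tokenize(d)) for d in descriptions]
    n = len(vecs)
    return [[0.0 if i == j else cosine_distance(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]

def agglomerative(dist, k):
    """Average-linkage agglomerative clustering (assumed linkage); returns one label per item."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                link = sum(dist[i][j] for i in clusters[a] for j in clusters[b]) / pairs
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the closest pair; b > a, so index a stays valid
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def silhouette_width(dist, labels):
    """Mean silhouette over all items; items in singleton clusters score 0."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in groups[label] if j != i]
        if not own or len(groups) < 2:
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)  # mean distance within own cluster
        b = min(sum(dist[i][j] for j in m) / len(m)  # mean distance to nearest other cluster
                for g, m in groups.items() if g != label)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy descriptions, not drawn from the study's dataset.
descriptions = [
    "edit photos with filters and camera effects",
    "photo editor with camera filters",
    "track running workouts and fitness goals",
    "fitness tracker for running and workouts",
]
dist = distance_matrix(descriptions)
labels = agglomerative(dist, k=2)
score = silhouette_width(dist, labels)
print(labels, round(score, 3))
```

A silhouette width close to 1 indicates compact, well-separated clusters, while values near 0 or below suggest overlapping ones. Swapping in a different feature extractor only requires replacing distance_matrix, which is what makes this kind of side-by-side comparison possible.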

Results The results of our study show that topic modelling and the collocation-based and dependency-based feature extractors perform similarly in detecting app-feature similarity. However, dependency-based feature extraction performs better than any other technique in finding application-domain similarity (ρ = 0.7, p-value < 0.01).

Conclusions The current categorisation in the app store studied does not exhibit good classification quality in terms of the claimed feature space. However, better quality can be achieved using a good feature extraction technique and a traditional clustering method.

Mon 5 Oct

16:10 - 17:50: Paper Presentations - Clones and Changes at TBD7
icse-2020-papers 16:10 - 16:30
Aishwarya Sivaraman (University of California, Los Angeles), Jason Lau (University of California, Los Angeles), Qian Zhang (UCLA), Muhammad Ali Gulzar (University of California, Los Angeles), Jason Cong (UCLA), Miryung Kim (University of California, Los Angeles)
icse-2020-papers 16:30 - 16:50
Weijie Zhou (North Carolina State University), Yue Zhao (NCSU), Guoqiang Zhang (North Carolina State University), Xipeng Shen (North Carolina State University)
icse-2020-papers 16:50 - 17:10
Thong Hoang (Singapore Management University, Singapore), Kang Hong Jin (School of Information Systems, Singapore Management University), Julia Lawall (Inria/LIP6), David Lo (Singapore Management University)
icse-2020-New-Ideas-and-Emerging-Results 17:10 - 17:20
Sebastian Baltes (The University of Adelaide), Christoph Treude (The University of Adelaide)
icse-2020-Journal-First 17:20 - 17:35
Chaiyong Ragkhitwetsagul (Mahidol University, Thailand), Jens Krinke (University College London)
icse-2020-Journal-First 17:35 - 17:50
Afnan Al-Subaihin (King Saud University), Federica Sarro (University College London, UK), Sue Black (Durham University), Licia Capra (University College London)