ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Thu 18 Apr 2024 14:30 - 14:45 at Maria Helena Vieira da Silva - Analytics 3 Chair(s): Sridhar Chimalakonda

Numerous tools rely on automatic categorization of Android apps as part of their methodology. However, incorrect categorization can lead to inaccurate outcomes, such as a malware detector wrongly flagging a benign app as malicious. One example is the SlideIT Free Keyboard app, which has over 500,000 downloads on Google Play. Although it is a “Keyboard” app, it is often wrongly categorized alongside “Language” apps because its description focuses heavily on language support. This leads to incorrect analysis outcomes, including the app being mislabeled as potential malware when it is actually benign. Hence, there is a need to improve the categorization of Android apps to benefit all the tools relying on it.

In this paper, we present a comprehensive evaluation of existing Android app categorization approaches using our new ground-truth dataset. Our evaluation shows that approaches using app descriptions are notably superior to those relying solely on data extracted from the APK file, while still leaving room for improvement in the former category. We therefore propose two new approaches that outperform existing methods in both the description-based and the APK-based settings. Finally, by employing our novel description-based approach, we demonstrate that adopting a higher-performing categorization method can significantly improve the overall performance of tools that rely on app categorization. This highlights the importance of developing advanced and efficient app categorization methods for software engineering tasks.

Thu 18 Apr

Displayed time zone: Lisbon

14:00 - 15:30
Analytics 3 - Research Track / Journal-first Papers / Demonstrations at Maria Helena Vieira da Silva
Chair(s): Sridhar Chimalakonda Indian Institute of Technology, Tirupati
14:00
15m
Talk
Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem
Research Track
Yun Peng The Chinese University of Hong Kong, Ruida Hu Harbin Institute of Technology, Shenzhen, Ruoke Wang Harbin Institute of Technology, Shenzhen, Cuiyun Gao Harbin Institute of Technology, Shuqing Li The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
14:15
15m
Talk
Data-Driven Evidence-Based Syntactic Sugar Design
Research Track
David OBrien Iowa State University, Robert Dyer University of Nebraska-Lincoln, Tien N. Nguyen University of Texas at Dallas, Hridesh Rajan Iowa State University
14:30
15m
Talk
Revisiting Android App Categorization
Research Track
Marco Alecci University of Luxembourg, Jordan Samhi CISPA Helmholtz Center for Information Security, Tegawendé F. Bissyandé University of Luxembourg, Jacques Klein University of Luxembourg
14:45
15m
Talk
Are Your Requests Your True Needs? Checking Excessive Data Collection in VPA App
Research Track
Fuman Xie University of Queensland, Chuan Yan University of Queensland, Mark Huasong Meng National University of Singapore, Shaoming Teng The University of Queensland, Yanjun Zhang Deakin University, Guangdong Bai University of Queensland
15:00
7m
Talk
Acrobats and Safety-Nets: Problematizing Large-Scale Agile Software Development
Journal-first Papers
Knut Rolland University of Oslo, Brian Fitzgerald Lero - The Irish Software Research Centre and University of Limerick, Torgeir Dingsøyr Norwegian University of Science and Technology and SimulaMet, Klaas-Jan Stol Lero; University College Cork; SINTEF Digital
15:07
7m
Talk
Program Transformation Landscapes for Automated Program Modification Using Gin: Extended Abstract
Journal-first Papers
Justyna Petke University College London, Brad Alexander University of Adelaide, Earl T. Barr University College London, Alexander E.I. Brownlee University of Stirling, Markus Wagner Monash University, Australia, David R. White University of Sheffield
15:14
7m
Talk
Boidae: Your Personal Mining Platform
Demonstrations
Brian Sigurdson Bowling Green State University, Samuel W. Flint University of Nebraska-Lincoln, Robert Dyer University of Nebraska-Lincoln
15:21
7m
Talk
Code Mapper: Mapping the Global Contributions of OSS
Demonstrations
Thomas Le Tourneau CY Tech, Jasmine Latendresse Concordia University, Ahmad Abdellatif University of Calgary, Emad Shihab Concordia University