ASE 2024
Sun 27 October - Fri 1 November 2024 Sacramento, California, United States
Tue 29 Oct 2024 14:30 - 14:45 at Magnolia - Android Chair(s): Ziyao He

Recent advances in Machine Learning (ML) have broadened the scope for automating diverse software engineering tasks through effective representation learning of software artifacts. Traditional methods often rely on manually selected, task-specific features, which can be imprecise and incomplete. In contrast, representation learning techniques, which allow the model itself to determine the most relevant features, offer a more scalable and generalizable approach. However, in the Android domain, existing models are either limited to coarse-grained, whole-app level tasks, as with apk2vec, or too specific to a single task, as with smali2vec.

Our research contributes to this field by proposing DexBERT, a novel BERT-like model specifically developed for DEX bytecode, the core binary format of Android applications. Inspired by the success of universal language models in natural language processing, DexBERT aims to abstract and encode deep semantic information from bytecode, facilitating its application to a variety of fine-grained, class-level software engineering tasks. We evaluate DexBERT’s effectiveness in modeling the DEX language and its performance across three distinct tasks: Malicious Code Localization, Defect Prediction, and Component Type Classification. Our results indicate that DexBERT provides substantial improvements over existing approaches, achieving significant accuracy gains and demonstrating its generalizability across multiple tasks.

Furthermore, DexBERT addresses the challenge of variable application sizes and demonstrates robust performance even with apps of vastly different scales. This adaptability is critical for practical deployment in real-world scenarios where application size can vary greatly.

In summary, DexBERT not only advances the state of the art in Android app analysis but also sets a new standard for the development of fine-grained, task-agnostic models in software engineering. Our contribution is significant, as it enables the development of more versatile and efficient tools for software analysis, reducing the reliance on costly manual feature engineering and repetitive model training.

Tue 29 Oct

Displayed time zone: Pacific Time (US & Canada)

13:30 - 15:00
Android: Journal-first Papers / Research Papers / Industry Showcase at Magnolia
Chair(s): Ziyao He University of California, Irvine
13:30
15m
Talk
How Does Code Optimization Impact Third-party Library Detection for Android Applications? (ACM SIGSOFT Distinguished Paper Award)
Research Papers
Zifan Xie Huazhong University of Science and Technology, Ming Wen Huazhong University of Science and Technology, Tinghan Li Huazhong University of Science and Technology, Yiding Zhu Huazhong University of Science and Technology, Qinsheng Hou Shandong University; Qi An Xin Group Corp., Hai Jin Huazhong University of Science and Technology
13:45
15m
Talk
MaskDroid: Robust Android Malware Detection with Masked Graph Representations
Research Papers
Jingnan Zheng National University of Singapore, Jiahao Liu National University of Singapore, An Zhang, Jun ZENG Huawei, Ziqi Yang Zhejiang University, Zhenkai Liang National University of Singapore, Tat-Seng Chua National University of Singapore
14:00
15m
Talk
A Longitudinal Analysis Of Replicas in the Wild Wild Android
Research Papers
Syeda Mashal Abbas Zaidi University of Waterloo, Shahpar Khan University of Waterloo, Parjanya Vyas University of Waterloo, Yousra Aafer University of Waterloo
14:15
15m
Talk
Android Malware Family Labeling: Perspectives from the Industry
Industry Showcase
Liu Wang Beijing University of Posts and Telecommunications, Haoyu Wang Huazhong University of Science and Technology, Tao Zhang Macau University of Science and Technology, Haitao Xu Zhejiang University, Guozhu Meng Institute of Information Engineering, Chinese Academy of Sciences, Peiming Gao MYbank, Ant Group, Chen Wei MYbank, Ant Group, Yi Wang
14:30
15m
Talk
DexBERT: Effective, Task-Agnostic and Fine-grained Representation Learning of Android Bytecode
Journal-first Papers
Tiezhu Sun University of Luxembourg, Kevin Allix Independent Researcher, Kisub Kim Singapore Management University, Singapore, Xin Zhou Singapore Management University, Singapore, Dongsun Kim Korea University, David Lo Singapore Management University, Tegawendé F. Bissyandé University of Luxembourg, Jacques Klein University of Luxembourg
14:45
15m
Talk
Same App, Different Behaviors: Uncovering Device-specific Behaviors in Android Apps
Industry Showcase
Zikan Dong Beijing University of Posts and Telecommunications, Yanjie Zhao Huazhong University of Science and Technology, Tianming Liu Monash University, Chao Wang University of Southern California, Guosheng Xu Beijing University of Posts and Telecommunications, Guoai Xu Harbin Institute of Technology, Shenzhen, Lin Zhang The National Computer Emergency Response Team/Coordination Center of China (CNCERT/CC), Haoyu Wang Huazhong University of Science and Technology