MalWhiteout: Reducing Label Errors in Android Malware Detection (Virtual)
Machine-learning-based Android malware detection has attracted a great deal of research in recent years. A reliable malware dataset is critical for evaluating the effectiveness of malware detection approaches. Unfortunately, the malware datasets used in our community are mainly labelled using existing anti-virus services (i.e., VirusTotal), which are prone to mislabelling. This, in turn, leads to inaccurate evaluation of malware detection techniques. Removing label noise from Android malware datasets can be quite challenging, especially at large scale. To address this problem, we propose an effective approach called MalWhiteout to reduce label errors in Android malware datasets. Specifically, we introduce Confident Learning (CL), an advanced noise estimation approach, to the domain of Android malware detection. To combat the false positives introduced by CL, we incorporate ensemble learning and inter-app relations to make noise detection more robust. We evaluate MalWhiteout on a curated, large-scale, and reliable benchmark dataset. Experimental results show that MalWhiteout detects label noise with over 94% accuracy even when the dataset's noise ratio is as high as 30%. MalWhiteout outperforms the state-of-the-art approach in both effectiveness (8% to 218% improvement) and efficiency (70 to 249 times faster) across different settings. We further show that reducing label noise improves the performance of existing malware detection approaches.
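The abstract names two ingredients, Confident Learning and ensemble learning, without giving details. The sketch below is only an illustration of the general idea and is not the MalWhiteout implementation: it applies the standard Confident Learning rule (per-class thresholds equal to the mean predicted probability of that class among samples carrying that label) and then keeps only the samples flagged by several models. The majority-vote combination, the function names, the `min_votes` parameter, and the toy probabilities are all assumptions made for this example; the inter-app relation step is not covered.

```python
from typing import List, Sequence

Probs = Sequence[Sequence[float]]  # probs[i][j] = predicted P(class j | sample i)


def class_thresholds(probs: Probs, labels: Sequence[int], n_classes: int) -> List[float]:
    """Threshold for class j: mean predicted P(j) over samples whose given label is j."""
    thresholds = []
    for j in range(n_classes):
        scores = [p[j] for p, y in zip(probs, labels) if y == j]
        # A class with no labelled samples gets an unreachable threshold.
        thresholds.append(sum(scores) / len(scores) if scores else float("inf"))
    return thresholds


def flag_suspects(probs: Probs, labels: Sequence[int], n_classes: int = 2) -> List[int]:
    """Indices whose confidently predicted class disagrees with the given label."""
    t = class_thresholds(probs, labels, n_classes)
    suspects = []
    for idx, (p, y) in enumerate(zip(probs, labels)):
        confident = [j for j in range(n_classes) if p[j] >= t[j]]
        if not confident:
            continue  # no class is predicted confidently; leave the label alone
        best = max(confident, key=lambda j: p[j])
        if best != y:
            suspects.append(idx)
    return suspects


def ensemble_flags(per_model_probs: Sequence[Probs], labels: Sequence[int],
                   min_votes: int) -> List[int]:
    """Assumed combination rule: keep a sample only if >= min_votes models flag it."""
    votes = [0] * len(labels)
    for probs in per_model_probs:
        for idx in flag_suspects(probs, labels):
            votes[idx] += 1
    return [i for i, v in enumerate(votes) if v >= min_votes]


# Toy data: 0 = benign, 1 = malware. Sample 2 is labelled benign, but the
# first two hypothetical models score it as malware.
labels = [0, 0, 0, 1, 1, 1]
model_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
model_b = [[0.7, 0.3], [0.15, 0.85], [0.1, 0.9], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
model_c = [[0.8, 0.2], [0.9, 0.1], [0.3, 0.7], [0.2, 0.8], [0.4, 0.6], [0.1, 0.9]]

print(flag_suspects(model_b, labels))
print(ensemble_flags([model_a, model_b, model_c], labels, min_votes=2))
```

In this toy run, one model on its own also flags sample 1, which is the kind of false positive the abstract attributes to CL; requiring agreement between models removes it. In practice the probabilities would need to be out-of-sample (for example, from cross-validation) so that a model does not simply memorise the noisy labels.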
Tue 11 Oct | Displayed time zone: Eastern Time (US & Canada)
10:30 - 12:30 | Technical Session 4 - Mobile Apps I | Research Papers / NIER Track / Industry Showcase / Journal-first Papers / Tool Demonstrations at Gold A | Chair(s): Jacques Klein (University of Luxembourg)
10:30 | 20m | Research paper | Mining Android API Usage to Generate Unit Test Cases for Pinpointing Compatibility Issues | Research Papers | Xiaoyu Sun (Monash University), Xiao Chen (Monash University), Yanjie Zhao (Monash University), Pei Liu (Monash University), John Grundy (Monash University), Li Li (Monash University)
10:50 | 20m | Paper | Automated, Cost-effective, and Update-driven App Testing (Virtual) | Journal-first Papers | Chanh-Duc Ngo (University of Luxembourg), Fabrizio Pastore (University of Luxembourg), Lionel Briand (University of Luxembourg; University of Ottawa)
11:10 | 20m | Industry talk | Fastbot2: Reusable Automated Model-based GUI Testing for Android Enhanced by Reinforcement Learning (Virtual) | Industry Showcase | Zhengwei Lv (ByteDance), Chao Peng (ByteDance, China), Zhao Zhang (Bytedance Network Technology), Ting Su (East China Normal University), Kai Liu (Bytedance), Ping Yang (Bytedance Network Technology)
11:30 | 10m | Vision and Emerging Results | Right to Know, Right to Refuse: Towards UI Perception-Based Automated Fine-Grained Permission Controls for Android Apps (Virtual) | NIER Track | Vikas K. Malviya (Singapore Management University), Chee Wei Leow (Singapore Management University), Ashok Kasthuri (Singapore Management University), Yan Naing Tun (Singapore Management University), Lwin Khin Shar (Singapore Management University), Lingxiao Jiang (Singapore Management University)
11:40 | 20m | Research paper | MalWhiteout: Reducing Label Errors in Android Malware Detection (Virtual) | Research Papers | Liu Wang (Beijing University of Posts and Telecommunications), Haoyu Wang (Huazhong University of Science and Technology, China), Xiapu Luo (Hong Kong Polytechnic University), Yulei Sui (University of Technology Sydney)
12:00 | 10m | Demonstration | AUSERA: Automated Security Vulnerability Detection for Android Apps (Virtual) | Tool Demonstrations | Sen Chen (Tianjin University), Yuxin Zhang (Tianjin University), Lingling Fan (Nankai University), Jiaming Li (Tianjin University), Yang Liu (Nanyang Technological University)
12:10 | 20m | Research paper | A Comprehensive Evaluation of Android ICC Resolution Techniques (Virtual) | Research Papers | Jiwei Yan (Institute of Software at Chinese Academy of Sciences, China), Shixin Zhang (Beijing Jiaotong University, China), Yepang Liu (Southern University of Science and Technology), Xi Deng (Institute of Software, Chinese Academy of Sciences), Jun Yan (Institute of Software at Chinese Academy of Sciences, China), Jian Zhang (Institute of Software at Chinese Academy of Sciences, China)