Active learning is an established technique for reducing the labeling cost of building high-quality machine learning models. However, state-of-the-art approaches focus on maximizing clean performance (e.g., accuracy) and disregard robustness. In this work, we propose Robust Active Learning, an active learning process that integrates adversarial training, the most established method to produce robust models. First, we conduct an empirical study to evaluate the effectiveness of existing approaches and to uncover the characteristics of the data they select. Then, we propose a novel approach, density-based robust sampling with entropy (DRE), that targets both clean performance and robustness. Our experiments cover 11 acquisition functions, 4 datasets, 6 DNN architectures, and 15105 trained DNNs.
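For readers unfamiliar with acquisition functions, the following is a minimal sketch of plain entropy-based selection in a single active learning round. It is not DRE, whose density component is not specified here, and it omits adversarial training altogether. The names `predictive_entropy`, `select_most_uncertain`, and `pool_probs` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    # Zero-probability classes contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` pool samples with the highest entropy."""
    scores = [predictive_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical softmax outputs of the current model on four unlabeled samples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# The selected samples would be sent for labeling and added to the training set.
selected = select_most_uncertain(pool_probs, budget=2)
print(selected)
```

In a full loop, the model would be retrained on the enlarged labeled set after each selection, and in the robust setting that retraining step would use adversarial training.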
Markus Haug University of Stuttgart, Institute of Software Engineering, Empirical Software Engineering Group, Justus Bogner University of Stuttgart, Institute of Software Engineering, Empirical Software Engineering Group
Yuejun Guo Interdisciplinary Centre for Security, Qiang Hu University of Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Mike Papadakis University of Luxembourg, Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg