An Empirical Evaluation of Mutation Operators for Deep Learning Systems
Deep Learning (DL) is increasingly adopted to solve complex tasks such as image recognition or autonomous driving. Companies are considering the inclusion of DL components in production systems, but one of their main concerns is how to assess the quality of such systems. Mutation testing is a technique that injects artificial faults into a system, under the assumption that the capability to expose (kill) such artificial faults translates into the capability to also expose real faults. Researchers have proposed approaches and tools (e.g., DeepMutation and MuNN) that make mutation testing applicable to deep learning systems. However, existing definitions of mutation killing, based on accuracy drop, do not take into account the stochastic nature of the training process: accuracy may drop even when re-training the un-mutated system. Moreover, the same mutation operator might be effective or might be trivial/impossible to kill, depending on its hyper-parameter configuration. We conducted an empirical evaluation of existing operators, showing that mutation killing requires a stochastic definition and identifying the subset of effective mutation operators together with their most effective configurations.
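To make the idea of a stochastic killing definition concrete, the following is a minimal sketch and not the procedure used in the paper, which the abstract does not spell out. It assumes that the original and the mutated model are each trained several times, and that a mutant counts as killed only when the two accuracy samples differ both significantly (a permutation test) and by a non-negligible effect size (Cohen's d). The function names, the thresholds `alpha` and `min_effect`, and the accuracy values are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def cohens_d(a, b):
    """Effect size: difference of means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    diff = statistics.mean(a) - statistics.mean(b)
    if pooled_var == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_var ** 0.5


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of mean accuracies."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)])
                   - statistics.mean(pooled[len(a):]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def is_killed(original_accs, mutant_accs, alpha=0.05, min_effect=0.5):
    """Assumed criterion: significant AND non-negligible accuracy difference."""
    p = permutation_p_value(original_accs, mutant_accs)
    d = cohens_d(original_accs, mutant_accs)
    return p < alpha and abs(d) >= min_effect


# Made-up accuracies from 10 training runs each (not results from the paper).
original = [0.912, 0.915, 0.909, 0.913, 0.911, 0.914, 0.910, 0.916, 0.912, 0.913]
retrained = [0.913, 0.910, 0.914, 0.911, 0.915, 0.912, 0.909, 0.913, 0.912, 0.914]
mutant = [0.871, 0.868, 0.875, 0.869, 0.873, 0.870, 0.874, 0.867, 0.872, 0.871]

print("re-trained original killed:", is_killed(original, retrained))
print("mutant killed:", is_killed(original, mutant))
```

Under this criterion, simply re-training the un-mutated system is not reported as a kill, because run-to-run noise alone does not produce a significant difference with a meaningful effect size, whereas a fixed accuracy-drop threshold evaluated on a single run could be crossed by chance.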
Mon 17 Apr (displayed time zone: Dublin)
16:00 - 18:00 | Session 5: Testing AI/ML systems (Research Papers / Previous Editions) at Grand canal | Chair(s): Jie M. Zhang (King's College London)
16:00 | 20m | Talk | Robustness assessment and improvement of a neural network for blood oxygen pressure estimation | Previous Editions | Paolo Arcaini (National Institute of Informatics), Andrea Bombarda (University of Bergamo), Silvia Bonfanti (University of Bergamo), Angelo Gargantini (University of Bergamo), Daniele Gamba (AISent S.r.l.), Rita Pedercini (AISent S.r.l.)
16:20 | 20m | Talk | An Empirical Evaluation of Mutation Operators for Deep Learning Systems | Previous Editions
16:40 | 20m | Talk | Distributed Repair of Deep Neural Networks | Research Papers | Davide Li Calsi (Politecnico di Milano), Matias Duran (National Institute of Informatics), Xiao-Yi Zhang (School of Computer and Communication Engineering, University of Science and Technology Beijing), Paolo Arcaini (National Institute of Informatics), Fuyuki Ishikawa (National Institute of Informatics)
17:00 | 20m | Talk | Mutation Testing of Deep Reinforcement Learning Based on Real Faults | Research Papers | Florian Tambon (Polytechnique Montréal), Vahid Majdinasab (Polytechnique Montréal), Amin Nikanjam (École Polytechnique de Montréal), Foutse Khomh (Polytechnique Montréal), Giuliano Antoniol (Polytechnique Montréal)
17:20 | 20m | Talk | Repairing DNN Architecture: Are We There Yet? | Research Papers | Jinhan Kim (KAIST), Nargiz Humbatova (USI Lugano), Gunel Jahangirova (King's College London), Paolo Tonella (USI Lugano), Shin Yoo (KAIST)