As Deep Neural Networks (DNNs) are rapidly being adopted within large software systems, software developers are increasingly required to design, train, and deploy such models into the systems they develop. Consequently, testing and improving the robustness of these models have received a lot of attention lately. However, relatively little effort has been made to address the difficulties developers experience when designing and training such models: if the evaluation of a model shows poor performance after the initial training, what should the developer change? We survey and evaluate existing state-of-the-art techniques that can be used to repair model performance, using a benchmark of both real-world mistakes developers made while designing DNN models and artificial faulty models generated by mutating the model code. The empirical evaluation shows that a random baseline is comparable to, and sometimes outperforms, existing state-of-the-art techniques. However, for larger and more complicated models, all repair techniques fail to find fixes. Our findings call for further research to develop more sophisticated techniques for Deep Learning repair.
Mon 17 Apr (displayed time zone: Dublin)
16:00 - 18:00 | Session 5: Testing AI/ML systems | Research Papers / Previous Editions | at Grand canal | Chair(s): Jie M. Zhang (King's College London)
16:00 | 20m | Talk | Robustness assessment and improvement of a neural network for blood oxygen pressure estimation | Previous Editions | Paolo Arcaini (National Institute of Informatics), Andrea Bombarda (University of Bergamo), Silvia Bonfanti (University of Bergamo), Angelo Gargantini (University of Bergamo), Daniele Gamba (AISent S.r.l.), Rita Pedercini (AISent S.r.l.)
16:20 | 20m | Talk | An Empirical Evaluation of Mutation Operators for Deep Learning Systems | Previous Editions
16:40 | 20m | Talk | Distributed Repair of Deep Neural Networks | Research Papers | Davide Li Calsi (Politecnico di Milano), Matias Duran (National Institute of Informatics), Xiao-Yi Zhang (School of Computer and Communication Engineering, University of Science and Technology Beijing), Paolo Arcaini (National Institute of Informatics), Fuyuki Ishikawa (National Institute of Informatics)
17:00 | 20m | Talk | Mutation Testing of Deep Reinforcement Learning Based on Real Faults | Research Papers | Florian Tambon (Polytechnique Montréal), Vahid Majdinasab (Polytechnique Montréal), Amin Nikanjam (École Polytechnique de Montréal), Foutse Khomh (Polytechnique Montréal), Giuliano Antoniol (Polytechnique Montréal)
17:20 | 20m | Talk | Repairing DNN Architecture: Are We There Yet? | Research Papers | Jinhan Kim (KAIST), Nargiz Humbatova (USI Lugano), Gunel Jahangirova (King's College London), Paolo Tonella (USI Lugano), Shin Yoo (KAIST)