On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering
Mutation analysis of deep neural networks (DNNs) is a promising method for evaluating test data quality and model robustness, but it can be computationally expensive, especially for large models. To alleviate this cost, we present DEEPMAACC, a technique and tool that speeds up DNN mutation analysis through neuron and mutant clustering. DEEPMAACC implements two methods: (1) neuron clustering, which reduces the number of generated mutants, and (2) mutant clustering, which reduces the number of mutants to be tested by selecting representative mutants. Both use hierarchical agglomerative clustering to group neurons and mutants with similar weights, with the goal of improving efficiency while maintaining the mutation score.
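To make the clustering idea concrete, the following is a minimal sketch of hierarchical agglomerative clustering over weight vectors, followed by the selection of one representative per cluster. It is not DEEPMAACC's implementation. Several details are assumptions that the abstract does not confirm: each neuron is described by a small vector of incoming weights, similarity is Euclidean distance, clusters are merged by average linkage until a target count is reached, and the representative is the member closest to the cluster centroid. The function names and the `weights` data are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    # Start with every item in its own cluster.
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (x < y) and merge y into x.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def representatives(vectors, clusters):
    """Pick, for each cluster, the member closest to the cluster centroid."""
    reps = []
    for members in clusters:
        dim = len(vectors[members[0]])
        centroid = [
            sum(vectors[i][d] for i in members) / len(members)
            for d in range(dim)
        ]
        reps.append(min(members, key=lambda i: dist(vectors[i], centroid)))
    return reps


# Hypothetical incoming-weight vectors for six neurons in one layer.
weights = [
    [0.10, 0.20], [0.14, 0.20], [0.11, 0.21],  # three similar neurons
    [0.90, -0.50], [0.88, -0.52],              # two similar neurons
    [-0.70, 0.60],                             # one dissimilar neuron
]
clusters = agglomerative_clusters(weights, n_clusters=3)
reps = representatives(weights, clusters)
print(clusters, reps)
```

Under these assumptions, mutants would be generated (or tested) only for the indices in `reps` rather than for all six items, which is where the saving comes from. The same routine could be applied to mutants by replacing the neuron weight vectors with a vector describing each mutant.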
DEEPMAACC has been evaluated on eight DNN models across four popular classification datasets and two DNN architectures. Compared with exhaustive (vanilla) mutation analysis, the neuron clustering approach accelerates mutation analysis by 72.44% on average, with an average error of -27.84% in mutation score. The mutant clustering approach accelerates mutation analysis by 39.48% on average, with an average error of -1.64% in mutation score. These results show that mutation testing speed can be traded off against mutation score error.
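The two reported quantities can be illustrated with a short calculation. The abstract does not give their exact definitions, so this sketch assumes that the speed-up is the percentage of running time saved relative to exhaustive analysis, that the error is the signed relative difference between the estimated and the exhaustive mutation score, and that the mutation score is the classical ratio of killed mutants to all mutants. All function names and numbers below are hypothetical and are not taken from the paper.

```python
def mutation_score(killed, total):
    """Classical mutation score: fraction of mutants killed by the test data."""
    return killed / total


def percent_time_saved(baseline_seconds, accelerated_seconds):
    """Percentage of running time saved relative to the exhaustive baseline."""
    return 100.0 * (baseline_seconds - accelerated_seconds) / baseline_seconds


def percent_score_error(exact_score, estimated_score):
    """Signed relative error of the estimated score; negative means underestimate."""
    return 100.0 * (estimated_score - exact_score) / exact_score


# Hypothetical run: exhaustive analysis tests 200 mutants in 1000 s,
# clustered analysis tests 60 representative mutants in 400 s.
exact = mutation_score(killed=150, total=200)
estimated = mutation_score(killed=42, total=60)

saved = percent_time_saved(1000.0, 400.0)
error = percent_score_error(exact, estimated)
print(f"time saved: {saved:.2f}%, score error: {error:.2f}%")
```

A negative error under this definition means the clustered analysis reports a lower mutation score than the exhaustive analysis, which matches the sign of the averages quoted above.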
Thu 3 Apr. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
11:00 - 12:30 | Testing ML Systems and Fault Localisation | Industry / Research Papers at Aula Magna (AM) | Chair(s): Atif Memon (Apple)
11:00 | 15m | Talk | On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering | Research Papers | Pre-print
11:15 | 15m | Talk | Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Research Papers | Stefano Carlo Lambertenghi (Technische Universität München, fortiss GmbH); Hannes Leonhard (Technical University of Munich); Andrea Stocco (Technical University of Munich, fortiss) | Pre-print
11:30 | 15m | Talk | Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code | Research Papers | Shahin Honarvar (Imperial College London); Mark van der Wilk (University of Oxford); Alastair F. Donaldson (Imperial College London)
11:45 | 15m | Talk | Taming Uncertainty for Critical Scenario Generation in Automated Driving | Industry | Selma Grosse (DENSO Automotive GmbH); Dejan Nickovic (Austrian Institute of Technology); Cristinel Mateis (AIT Austrian Institute of Technology GmbH); Alessio Gambi (Austrian Institute of Technology (AIT)); Adam Molin (DENSO AUTOMOTIVE)
12:00 | 15m | Talk | Multi-Project Just-in-Time Software Defect Prediction Based on Multi-Task Learning for Mobile Applications | Research Papers | Feng Chen (Chongqing University of Posts and Telecommunications); Ke Yuxin (Chongqing University of Posts and Telecommunications); Liu Xin (Chongqing University of Posts and Telecommunications); Wei Qingjie (Chongqing University of Posts and Telecommunications)
12:15 | 15m | Talk | Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces | Industry | Neetha Jambigi (University of Cologne); Bartosz Bogacz (SAP SE); Moritz Mueller (SAP SE); Thomas Bach (SAP); Michael Felderer (German Aerospace Center (DLR) & University of Cologne)