DeepSHAP Summary for Adversarial Example Detection
Deep learning is widely applied across many fields, and Explainable AI (XAI) helps interpret model predictions to make them more reliable and trustworthy. Taking advantage of this, we propose using the decision logic revealed by explanations to detect adversarial examples. We show that normal and adversarial examples exhibit different interpretation distributions and trigger different decision logic in the network, which can be used to tell them apart. Specifically, we first use DeepSHAP values to calculate the neuron contributions of the classification model layer by layer and show how this layer-wise explanation can distinguish between normal and adversarial examples. Second, we select critical neurons by their SHAP values, generate a bitmap representing the distribution of critical neurons, and use this bitmap instead of the raw SHAP values to detect adversarial examples. Preliminary results on the CIFAR-10 dataset demonstrate that the more SHAP layer information is provided, the better the detection accuracy. Furthermore, the critical neuron path, which concentrates on critical neurons, achieves higher detection accuracy than single- or multi-layer SHAP values and accuracy comparable to using all-layer SHAP values, while significantly reducing the computation cost of training and testing the adversarial example detector.
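The sketch below illustrates the second step described in the abstract: turning per-neuron SHAP contributions into a critical-neuron bitmap and training a simple detector on those bitmaps. It is a minimal illustration, not the authors' implementation; the `top_k` fraction, the logistic-regression detector, and the randomly generated placeholder SHAP values are assumptions. In the actual pipeline the per-neuron contributions would come from DeepSHAP (e.g., `shap.DeepExplainer`) applied layer by layer to a CIFAR-10 classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


def critical_neuron_bitmap(shap_values, top_k=0.1):
    """Convert per-neuron SHAP values for one example into a binary bitmap.

    shap_values: 1-D array of SHAP contributions for the neurons of one
    (or several concatenated) layers; top_k: fraction of neurons marked critical.
    """
    n_critical = max(1, int(len(shap_values) * top_k))
    bitmap = np.zeros_like(shap_values, dtype=np.uint8)
    # Neurons with the largest absolute contribution are treated as critical.
    critical_idx = np.argsort(np.abs(shap_values))[-n_critical:]
    bitmap[critical_idx] = 1
    return bitmap


# Placeholder per-neuron SHAP values for normal and adversarial inputs.
# In practice these would be produced by DeepSHAP on the target network.
rng = np.random.default_rng(0)
n_examples, n_neurons = 200, 512
shap_normal = rng.normal(0.0, 1.0, size=(n_examples, n_neurons))
shap_adv = rng.normal(0.3, 1.2, size=(n_examples, n_neurons))

# Build bitmaps and labels (0 = normal, 1 = adversarial).
X = np.vstack([
    np.stack([critical_neuron_bitmap(s) for s in shap_normal]),
    np.stack([critical_neuron_bitmap(s) for s in shap_adv]),
])
y = np.concatenate([np.zeros(n_examples), np.ones(n_examples)])

# Train a lightweight detector on the bitmaps instead of the raw SHAP values.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
detector = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("detection accuracy:", accuracy_score(y_test, detector.predict(X_test)))
```

Because the bitmap is binary and only marks the top-contributing neurons, the detector operates on a much sparser, cheaper representation than the full set of layer-wise SHAP values, which is the source of the reported savings in training and testing cost.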
Mon 15 May (displayed time zone: Hobart)
13:45 - 15:15 | DeepTest session
13:45 (20m) Talk: Metamorphic Testing of Machine Translation Models using Back Translation (DeepTest)
14:05 (20m) Talk: A Method of Identifying Causes of Prediction Errors to Accelerate MLOps (DeepTest)
14:25 (20m) Talk: DeepSHAP Summary for Adversarial Example Detection (DeepTest)
14:45 (20m) Talk: DeepPatch: A Patching-Based Method for Repairing Deep Neural Networks (DeepTest)