Multimodal Surprise Adequacy Analysis of Inputs for Natural Language Processing DNN Models
As Deep Neural Networks (DNNs) are rapidly adopted in various domains, many test adequacy metrics for DNN inputs have been introduced to help evaluate and validate trained DNN models. Surprise Adequacy (SA) is one such metric: it aims to quantify how surprising a new input is with respect to the data used to train the given model. While SA has been shown to be effective for computer vision tasks such as image classification or object segmentation, its efficacy for DNN-based Natural Language Processing (NLP) has not been thoroughly studied. This paper evaluates whether it is feasible to apply SA analysis to DNN models trained for NLP tasks. We also show that the input distribution captured in the latent embedding space can be multimodal for some NLP tasks, unlike the distributions observed in computer vision tasks, and investigate whether accounting for this multimodality can improve SA analysis. An empirical evaluation of extended SA metrics with three NLP tasks and nine DNN models shows that, while unimodal SA metrics perform sufficiently well for text classification, multimodal SA can outperform unimodal metrics.
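The difference between a unimodal and a multimodal view of the latent space can be illustrated with a small sketch. The abstract does not give the SA formula, so the score below is an assumed stand-in rather than the paper's metric: surprise is taken to be the negative log-density of an input's latent vector, either under a single diagonal Gaussian fitted to all training vectors (unimodal) or under a mixture with one Gaussian per cluster (multimodal). The function names, the toy 2-D "latent" points, and the given cluster labels are all hypothetical.

```python
import math
from statistics import mean, pvariance

def fit_diag_gaussian(points):
    """Per-dimension (mean, variance) of a list of equal-length vectors."""
    dims = list(zip(*points))
    # A small variance floor avoids division by zero on degenerate dimensions.
    return [(mean(d), max(pvariance(d), 1e-9)) for d in dims]

def surprise(x, params):
    """Negative log-density of x under a diagonal Gaussian (higher = more surprising)."""
    return sum(
        0.5 * math.log(2 * math.pi * var) + (xi - mu) ** 2 / (2 * var)
        for xi, (mu, var) in zip(x, params)
    )

def unimodal_surprise(x, train):
    """Assumed unimodal score: one Gaussian fitted to all training vectors."""
    return surprise(x, fit_diag_gaussian(train))

def multimodal_surprise(x, train, cluster_ids):
    """Assumed multimodal score: mixture with one Gaussian per given cluster."""
    groups = {}
    for point, cid in zip(train, cluster_ids):
        groups.setdefault(cid, []).append(point)
    n = len(train)
    # Log of (mixture weight * component density) for each cluster.
    logs = [
        math.log(len(g) / n) - surprise(x, fit_diag_gaussian(g))
        for g in groups.values()
    ]
    # Log-sum-exp for numerical stability.
    m = max(logs)
    return -(m + math.log(sum(math.exp(v - m) for v in logs)))

# Toy training vectors forming two well-separated modes (hypothetical data).
train = [(-5.5, 0.2), (-5.0, -0.3), (-4.5, 0.1), (-5.0, 0.0),
         (5.5, -0.2), (5.0, 0.3), (4.5, -0.1), (5.0, 0.0)]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]

typical = (5.0, 0.0)   # sits inside one mode
between = (0.0, 0.0)   # sits in the empty region between the modes

print("unimodal  :", unimodal_surprise(typical, train), unimodal_surprise(between, train))
print("multimodal:", multimodal_surprise(typical, train, cluster_ids),
      multimodal_surprise(between, train, cluster_ids))
```

In this toy setting the single Gaussian rates the point between the two modes as less surprising than a typical training-like point, because its mean falls in the empty region; the mixture reverses that ordering. This is only meant to show why a multimodal latent distribution might call for a multimodal score, not to reproduce the evaluated metrics.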
Fri 21 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
Session 12:00 - 13:15
Time | Duration | Type | Title | Venue | Authors
12:00 | 30m | Long-paper | An Evolutionary Approach to Adapt Tests Across Mobile Apps | AST 2021 | Leonardo Mariani, University of Milano Bicocca; Mauro Pezze, USI Lugano, Switzerland; Valerio Terragni, The University of Auckland; Daniele Zuddas, Università della Svizzera italiana (USI)
12:30 | 15m | Short-paper | A framework for the automation of testing computer vision systems | AST 2021 | Franz Wotawa; Ledio Jahaj, Technische Universitaet Graz; Lorenz Klampfl, Graz University of Technology, Austria
12:45 | 30m | Long-paper | Multimodal Surprise Adequacy Analysis of Inputs for Natural Language Processing DNN Models | AST 2021 |