Distribution Models for Falsification and Verification of DNNs
DNN validation and verification approaches that are input distribution agnostic waste effort on irrelevant inputs and report false property violations. Drawing on the large body of work on model-based validation and verification of traditional systems, we introduce the first approach that leverages environmental models to focus DNN falsification and verification on the relevant input space. Our approach, DFV, automatically builds an input distribution model using unsupervised learning, prefixes that model to the DNN to force all inputs to come from the learned distribution, and reformulates the property to the input space of the distribution model. This transformed verification problem allows existing DNN falsification and verification tools to target the input distribution – avoiding consideration of infeasible inputs. Our study of DFV with 7 falsification and verification tools, two DNNs defined over different data sets, and 93 distinct distribution models, provides clear evidence that the counter-examples found by the tools are much more representative of the data distribution, and it shows how the performance of DFV varies across domains, models, and tools.