TestifAI: Probabilistic Context-Aware Testing for Safe Deep Learning Models
Deep neural networks (DNNs) are integral to AI systems deployed in safety-critical domains such as autonomous vehicles, healthcare, and cybersecurity. However, assessing their robustness in real-world scenarios remains a significant challenge. Traditional metrics such as neuron coverage often fail to capture the sensitivity of DNN models to input perturbations, increasing the risk of unexpected failures. This paper introduces TestifAI, a probabilistic framework for comprehensively evaluating and enhancing DNN robustness across diverse, context-sensitive scenarios. TestifAI comprises four key stages: (1) specification, where, for a given DNN model and dataset, users specify test criteria in the form of robustness properties (e.g., sensitivity to image rotation), their configurations (e.g., the range of rotation angles), and statistical dependencies between properties and configurations that simulate real-world scenarios; (2) mapping, which constructs a probabilistic coverage graph capturing these dependencies and updates its probabilities as testing progresses; (3) test-case generation, which systematically produces targeted unit tests aligned with the specified robustness properties and their configurations; and (4) execution, where the framework runs the generated tests and computes both local (property-specific) and global (system-wide, dependency-based) coverage metrics. Experimental results demonstrate that TestifAI offers a rigorous and adaptable approach to DNN testing, suited to complex, mission-critical, and safety-sensitive environments.
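To make the four-stage pipeline concrete, the sketch below mocks it up in Python. It is a minimal illustration under stated assumptions: the class and function names (RobustnessProperty, CoverageGraph, generate_test, run), the toy model, the perturbations, the dependency weights, and the invariance-based oracle are all hypothetical placeholders, not TestifAI's actual API.

```python
"""Hypothetical sketch of the four TestifAI stages; all names are illustrative."""
import random

# Stage 1: specification -- a robustness property pairs a perturbation
# with the configuration range over which it should be exercised.
class RobustnessProperty:
    def __init__(self, name, perturb, config_range):
        self.name = name
        self.perturb = perturb            # callable: (input, config) -> perturbed input
        self.config_range = config_range  # e.g., rotation angles in degrees

# Stage 2: mapping -- a probabilistic coverage graph whose edges carry
# user-specified dependencies P(b | a); counts are updated as testing progresses.
class CoverageGraph:
    def __init__(self, properties, dependencies):
        self.properties = {p.name: p for p in properties}
        self.dependencies = dependencies   # {(a, b): P(b | a)}
        self.hits = {p.name: 0 for p in properties}
        self.total = 0

    def record(self, name):
        self.hits[name] += 1
        self.total += 1

    def local_coverage(self, name):
        # Property-specific coverage: share of executed tests exercising `name`.
        return self.hits[name] / self.total if self.total else 0.0

    def global_coverage(self):
        # System-wide coverage: local coverages weighted by dependency probabilities.
        total_weight = sum(self.dependencies.values()) or 1.0
        return sum(p * self.local_coverage(b)
                   for (a, b), p in self.dependencies.items()) / total_weight

# Stage 3: test-case generation -- sample a configuration within the property's range.
def generate_test(prop, rng):
    lo, hi = prop.config_range
    return rng.uniform(lo, hi)

# Stage 4: execution -- run perturbed inputs through the model, flag prediction
# changes (assumed invariance oracle), and update the coverage graph.
def run(model, inputs, graph, n_tests=100, seed=0):
    rng = random.Random(seed)
    failures = []
    for _ in range(n_tests):
        prop = rng.choice(list(graph.properties.values()))
        cfg = generate_test(prop, rng)
        x = rng.choice(inputs)
        if model(prop.perturb(x, cfg)) != model(x):
            failures.append((prop.name, cfg))
        graph.record(prop.name)
    return failures

# Toy usage: a scalar threshold "model" with stand-in rotation/brightness perturbations.
model = lambda x: x > 0.5
rotation = RobustnessProperty("rotation", lambda x, a: x + a / 360.0, (-30.0, 30.0))
brightness = RobustnessProperty("brightness", lambda x, b: x * b, (0.8, 1.2))
graph = CoverageGraph([rotation, brightness],
                      {("rotation", "brightness"): 0.6,
                       ("brightness", "rotation"): 0.4})
failures = run(model, [0.2, 0.5, 0.8], graph)
print(len(failures), graph.local_coverage("rotation"), graph.global_coverage())
```

In this toy setup, the global metric simply averages property-level coverages weighted by the dependency probabilities; the paper's graph presumably updates these probabilities dynamically during testing, which the sketch only approximates with static weights.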