ICSE 2023
Sun 14 - Sat 20 May 2023 Melbourne, Australia
Fri 19 May 2023 11:00 - 11:15 at Meeting Room 101 - AI testing 2 Chair(s): Gunel Jahangirova

Deep learning (DL) plays an increasingly important role in daily life due to its competitive performance in industrial application domains. As the core of DL-enabled systems, deep neural networks (DNNs) need to be carefully evaluated to ensure that the quality of the trained model meets the demand. In practice, the de facto standard for assessing the quality of DNNs in industry is to check their performance (accuracy) on a collected set of labeled test data. However, preparing such labeled data is often difficult, partly because data labeling is labor-intensive, especially given the massive volume of new unlabeled data arriving every day. Recent studies show that test selection for DNNs is a promising direction that tackles this issue by selecting a minimal representative subset of data to label and using it to assess the model. However, test selection still requires human effort and cannot be fully automated. In this paper, we propose a novel technique, named Aries, that estimates the performance of DNNs on new unlabeled data using only information obtained from the original test data. The key insight behind our technique is that a model should have similar prediction accuracy on data that lie at similar distances from the decision boundary. We performed a large-scale evaluation of our technique on 2 well-known datasets, CIFAR-10 and Tiny-ImageNet, 4 widely studied DNN models including ResNet101 and DenseNet121, and 13 types of data transformation methods. Results show that the accuracy estimated by Aries is only 0.03% – 2.60% off the true accuracy. Moreover, Aries outperforms the state-of-the-art labeling-free methods in 50 out of 52 cases and selection-labeling-based methods in 96 out of 128 cases.
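The key insight in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how distance to the decision boundary is measured, so the sketch assumes the gap between the top two softmax probabilities as a stand-in, and it assumes equal-width buckets. The names `margin_scores` and `estimate_accuracy` are hypothetical. The idea is to measure per-bucket accuracy on the labeled test data, then weight those accuracies by how the new unlabeled data fall into the same buckets.

```python
import numpy as np


def margin_scores(probs):
    """Assumed boundary-distance proxy: top-1 minus top-2 class probability."""
    top2 = np.sort(probs, axis=1)[:, -2:]  # two largest values, ascending
    return top2[:, 1] - top2[:, 0]


def estimate_accuracy(test_probs, test_labels, new_probs, n_buckets=10):
    """Estimate accuracy on unlabeled data from labeled test data.

    Samples are grouped into equal-width margin buckets (an assumption).
    Each new sample inherits the test accuracy of its bucket.
    """
    # Interior bucket edges; np.digitize then yields indices 0..n_buckets-1.
    inner_edges = np.linspace(0.0, 1.0, n_buckets + 1)[1:-1]
    test_bucket = np.digitize(margin_scores(test_probs), inner_edges)
    new_bucket = np.digitize(margin_scores(new_probs), inner_edges)

    test_correct = test_probs.argmax(axis=1) == np.asarray(test_labels)
    overall_acc = test_correct.mean()

    expected_correct = 0.0
    for b in range(n_buckets):
        n_new = int(np.sum(new_bucket == b))
        if n_new == 0:
            continue
        in_test = test_bucket == b
        # Fall back to overall test accuracy if no test sample is in the bucket.
        bucket_acc = test_correct[in_test].mean() if in_test.any() else overall_acc
        expected_correct += bucket_acc * n_new
    return float(expected_correct / len(new_probs))
```

Called as `estimate_accuracy(test_probs, test_labels, new_probs)`, where the probability arrays have shape `(n_samples, n_classes)`, the function returns a single estimated accuracy in the range 0 to 1 without needing any labels for the new data.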

Fri 19 May

Displayed time zone: Hobart

11:00 - 12:30
AI testing 2 (Technical Track / Journal-First Papers) at Meeting Room 101
Chair(s): Gunel Jahangirova USI Lugano, Switzerland
11:00
15m
Talk
Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation
Technical Track
Qiang Hu University of Luxembourg, Yuejun Guo University of Luxembourg, Xiaofei Xie Singapore Management University, Maxime Cordy University of Luxembourg, Luxembourg, Lei Ma University of Alberta, Mike Papadakis University of Luxembourg, Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg
11:15
15m
Talk
Testing the Plasticity of Reinforcement Learning Based Systems
Journal-First Papers
Matteo Biagiola Università della Svizzera italiana, Paolo Tonella USI Lugano
11:30
15m
Talk
CC: Causality-Aware Coverage Criterion for Deep Neural Networks
Technical Track
Zhenlan Ji The Hong Kong University of Science and Technology, Pingchuan Ma HKUST, Yuanyuan Yuan The Hong Kong University of Science and Technology, Shuai Wang Hong Kong University of Science and Technology
11:45
15m
Talk
Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests
Technical Track
Chunqiu Steven Xia University of Illinois at Urbana-Champaign, Saikat Dutta University of Illinois at Urbana-Champaign, Sasa Misailovic University of Illinois at Urbana-Champaign, Darko Marinov University of Illinois at Urbana-Champaign, Lingming Zhang University of Illinois at Urbana-Champaign
12:00
15m
Talk
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Technical Track
Fitash Ul Haq, Donghwan Shin The University of Sheffield, Lionel Briand University of Luxembourg; University of Ottawa
12:15
15m
Talk
Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects
Technical Track
Linyi Li University of Illinois at Urbana-Champaign, Yuhao Zhang University of Wisconsin-Madison, Luyao Ren Peking University, China, Yingfei Xiong Peking University, Tao Xie Peking University