Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation (ICSE 2023 - Technical Track)

Who

Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon

Track

ICSE 2023 Technical Track

Time Zone

The program is currently displayed in (GMT+10:00) Hobart.

When

Fri 19 May 2023 11:00 - 11:15 at Meeting Room 101 - AI testing 2 Chair(s): Gunel Jahangirova

Abstract

Deep learning (DL) plays an increasingly important role in daily life owing to its competitive performance in industrial application domains. As the core of DL-enabled systems, deep neural networks (DNNs) need to be carefully evaluated to ensure that the quality of the trained model meets requirements. In practice, the de facto standard for assessing DNN quality in industry is to check performance (accuracy) on a collected set of labeled test data. However, preparing such labeled data is often difficult, partly because labeling is labor-intensive, especially given the massive volume of new unlabeled data arriving every day. Recent studies show that test selection for DNNs is a promising direction that tackles this issue by selecting a minimal representative subset of data to label and using it to assess the model. However, test selection still requires human effort and cannot be fully automated. In this paper, we propose a novel technique, named Aries, that estimates the performance of DNNs on new unlabeled data using only the information obtained from the original test data. The key insight behind our technique is that a model should have similar prediction accuracy on data that lie at similar distances from the decision boundary. We performed a large-scale evaluation of our technique on 2 well-known datasets, CIFAR-10 and Tiny-ImageNet, 4 widely studied DNN models including ResNet101 and DenseNet121, and 13 types of data transformation methods. Results show that the accuracy estimated by Aries is only 0.03% to 2.60% off the true accuracy. In addition, Aries outperforms the state-of-the-art labeling-free methods in 50 out of 52 cases and selection-labeling-based methods in 96 out of 128 cases.
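
The key insight in the abstract (similar accuracy at similar distances from the decision boundary) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how distance to the boundary is measured, so the sketch assumes a stand-in proxy, the gap between the two largest class probabilities. The names `top2_margin`, `estimate_accuracy`, and `n_bins` are hypothetical. Labeled test data is grouped into buckets by the proxy, accuracy is measured per bucket, and the estimate for unlabeled data is the per-bucket accuracy weighted by how much of the new data falls into each bucket.

```python
import numpy as np


def top2_margin(probs):
    """Assumed proxy for distance to the decision boundary (not from the
    abstract): gap between the two largest class probabilities per row."""
    ordered = np.sort(probs, axis=1)
    return ordered[:, -1] - ordered[:, -2]


def estimate_accuracy(test_probs, test_labels, new_probs, n_bins=10):
    """Estimate accuracy on unlabeled data from labeled test data.

    test_probs:  (n_test, n_classes) predicted probabilities on labeled data
    test_labels: (n_test,) ground-truth labels for the test data
    new_probs:   (n_new, n_classes) predicted probabilities on unlabeled data
    """
    # Equal-width buckets over the margin range [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    test_correct = test_probs.argmax(axis=1) == test_labels

    # Using only the interior edges maps each margin to a bucket 0..n_bins-1.
    test_bins = np.digitize(top2_margin(test_probs), edges[1:-1])
    new_bins = np.digitize(top2_margin(new_probs), edges[1:-1])

    overall = test_correct.mean()
    estimate = 0.0
    for b in range(n_bins):
        weight = np.mean(new_bins == b)  # share of new data in this bucket
        if weight == 0:
            continue
        in_bin = test_bins == b
        # Fall back to overall test accuracy if no test sample hit this bucket.
        bin_acc = test_correct[in_bin].mean() if in_bin.any() else overall
        estimate += weight * bin_acc
    return float(estimate)
```

Because the estimate only reuses per-bucket accuracy from data that is already labeled, no new labels are needed. Its quality depends entirely on how well the chosen proxy tracks the true distance to the boundary, which is an assumption of this sketch.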

Link to Preprint

https://mpapad.github.io//publications/pdfs/ICSE2023_Acc_Estimation.pdf

Qiang Hu, University of Luxembourg

Yuejun Guo, University of Luxembourg

Xiaofei Xie, Singapore Management University, Singapore

Maxime Cordy, University of Luxembourg, Luxembourg

Lei Ma, University of Alberta, Canada

Mike Papadakis, University of Luxembourg, Luxembourg

Yves Le Traon, University of Luxembourg, Luxembourg

Session Program

Fri 19 May
Displayed time zone: Hobart (GMT+10:00)

11:00 - 12:30	AI testing 2 (Technical Track / Journal-First Papers) at Meeting Room 101. Chair(s): Gunel Jahangirova, USI Lugano, Switzerland

11:00 15m Talk		Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation (Technical Track). Qiang Hu (University of Luxembourg), Yuejun Guo (University of Luxembourg), Xiaofei Xie (Singapore Management University), Maxime Cordy (University of Luxembourg, Luxembourg), Lei Ma (University of Alberta), Mike Papadakis (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
11:15 15m Talk		Testing the Plasticity of Reinforcement Learning Based Systems (Journal-First Papers). Matteo Biagiola (Università della Svizzera italiana), Paolo Tonella (USI Lugano)
11:30 15m Talk		CC: Causality-Aware Coverage Criterion for Deep Neural Networks (Technical Track). Zhenlan Ji (The Hong Kong University of Science and Technology), Pingchuan Ma (HKUST), Yuanyuan Yuan (The Hong Kong University of Science and Technology), Shuai Wang (Hong Kong University of Science and Technology)
11:45 15m Talk		Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests (Technical Track). Chunqiu Steven Xia (University of Illinois at Urbana-Champaign), Saikat Dutta (University of Illinois at Urbana-Champaign), Sasa Misailovic (University of Illinois at Urbana-Champaign), Darko Marinov (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)
12:00 15m Talk		Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems (Technical Track). Fitash ul haq, Donghwan Shin (The University of Sheffield), Lionel Briand (University of Luxembourg; University of Ottawa)
12:15 15m Talk		Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects (Technical Track). Linyi Li (University of Illinois at Urbana-Champaign), Yuhao Zhang (University of Wisconsin-Madison), Luyao Ren (Peking University, China), Yingfei Xiong (Peking University), Tao Xie (Peking University)