Multiple-Boundary Clustering and Prioritization to Promote Neural Network Retraining (ASE 2020 - Research Papers)

Who

Weijun Shen, Yanhui Li, Lin Chen, YuanLei Han, Yuming Zhou, Baowen Xu

Track

ASE 2020 Research Papers

When

Wed 23 Sep 2020 00:00 - 00:20 at Kangaroo - Software Engineering for AI (1) Chair(s): Song Wang

Abstract

With the increasing application of deep learning (DL) models in many safety-critical scenarios, effective and efficient DL testing techniques are much in demand to improve the quality of DL models. One of the major challenges is the data gap between the training data used to construct the models and the testing data used to evaluate them. To bridge the gap, testers aim to collect an effective subset of inputs from the testing contexts, with limited labeling effort, for retraining DL models.

To assist the subset selection, we propose Multiple-Boundary Clustering and Prioritization (MCP), a technique that clusters test samples into the boundary areas of the multiple decision boundaries of a DL model and assigns priorities so that samples are selected evenly from all boundary areas, ensuring that each boundary has enough useful samples for its reconstruction.
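The abstract does not spell out how boundary areas or priorities are computed, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that a sample's boundary area is identified by the pair of its two most probable predicted classes, that the priority is the ratio of the second-highest to the highest predicted probability (closer to 1 meaning closer to the boundary), and that "evenly" means round-robin selection across areas. The function name `mcp_select` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def mcp_select(probs, budget):
    """Pick up to `budget` sample indices, spread evenly over boundary areas.

    probs: array of shape (n_samples, n_classes) with predicted probabilities.
    Assumption: a boundary area is the (top-1 class, top-2 class) pair.
    Assumption: priority = p(top-2) / p(top-1); higher is closer to the boundary.
    """
    probs = np.asarray(probs, dtype=float)
    # Classes sorted by descending probability for each sample.
    order = np.argsort(-probs, axis=1, kind="stable")
    top1, top2 = order[:, 0], order[:, 1]
    rows = np.arange(len(probs))
    priority = probs[rows, top2] / np.maximum(probs[rows, top1], 1e-12)

    # Cluster samples by boundary area; each queue is ordered by priority.
    areas = {}
    for i in np.argsort(-priority, kind="stable"):
        key = (int(top1[i]), int(top2[i]))
        areas.setdefault(key, []).append(int(i))

    # Round-robin over the areas so every boundary contributes samples.
    queues = list(areas.values())
    selected = []
    while len(selected) < budget and any(queues):
        for queue in queues:
            if queue and len(selected) < budget:
                selected.append(queue.pop(0))
    return selected


if __name__ == "__main__":
    example = [
        [0.50, 0.45, 0.05],
        [0.90, 0.07, 0.03],
        [0.10, 0.50, 0.40],
        [0.55, 0.40, 0.05],
        [0.05, 0.60, 0.35],
    ]
    print(mcp_select(example, budget=3))
```

In this sketch the selected indices would then be labeled and used for retraining; a purely uncertainty-ranked selection could instead draw most of the budget from a single boundary, which is the imbalance that even selection across areas is meant to avoid.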

To evaluate MCP, we conduct an extensive empirical study with three popular DL models and 33 simulated testing contexts. The experimental results show that, compared with state-of-the-art baseline methods, MCP is significantly more effective, as measured by the improved quality of the retrained DL models, and is also more efficient in terms of time cost.

Weijun Shen

Nanjing University

Yanhui Li

Department of Computer Science and Technology, Nanjing University

Lin Chen

Nanjing University

China

YuanLei Han

Nanjing University

Yuming Zhou

Nanjing University

Baowen Xu

State Key Laboratory for Novel Software Technology, Nanjing University

Session Program

Wed 23 Sep

00:00 - 01:00	Software Engineering for AI (1) (NIER track / Research Papers) at Kangaroo. Chair(s): Song Wang, York University, Canada

00:00 20m Talk		Multiple-Boundary Clustering and Prioritization to Promote Neural Network Retraining Research Papers Weijun Shen Nanjing University, Yanhui Li Department of Computer Science and Technology, Nanjing University, Lin Chen Nanjing University, YuanLei Han Nanjing University, Yuming Zhou Nanjing University, Baowen Xu State Key Laboratory for Novel Software Technology, Nanjing University
00:20 20m Talk		MARBLE: Model-Based Robustness Analysis of Stateful Deep Learning Systems Research Papers Xiaoning Du Nanyang Technological University, Yi Li Nanyang Technological University, Xiaofei Xie Nanyang Technological University, Lei Ma Kyushu University, Yang Liu Nanyang Technological University, Singapore, Jianjun Zhao Kyushu University
00:40 10m Talk		Making Fair ML Software using Trustworthy Explanation NIER track Joymallya Chakraborty North Carolina State University, USA, Kewen Peng North Carolina State University, Tim Menzies North Carolina State University, USA