ISSTA 2022
Mon 18 - Fri 22 July 2022 Online
Wed 20 Jul 2022 07:40 - 08:00 at ISSTA 2 - Session 2-2: Neural Networks, Learning, NLP E
Thu 21 Jul 2022 16:40 - 17:00 at ISSTA 2 - Session 3-6: Neural Networks, Learning, NLP F

A well-trained deep learning (DL) model often cannot achieve its expected performance after deployment due to the mismatch between the distributions of the training data and the field data in the operational environment. Therefore, repairing DL models is critical, especially when they are deployed on increasingly large tasks with shifted distributions.

Generally speaking, it is easy to obtain a large amount of field data. Existing solutions develop various techniques to select a subset for annotation and then fine-tune the model for repair. While effective, achieving a higher repair rate is inevitably associated with more expensive labeling costs. To mitigate this problem, we propose a novel annotation-efficient repair solution for DL models, namely HybridRepair, wherein we take a holistic approach that coordinates the use of a small amount of annotated data and a large amount of unlabeled data for repair. Our key insight is that accurate yet sufficient training data is needed to repair the corresponding failure region in the data distribution. On the one hand, under a given labeling budget, we selectively annotate some data in each failure region and propagate their labels to the neighboring data. On the other hand, we take advantage of semi-supervised learning (SSL) techniques to further boost the training data density. However, unlike existing SSL solutions that try to use all the unlabeled data, we use only a selected part of it, considering the impact of distribution shift on SSL solutions. Experimental results show that HybridRepair outperforms both state-of-the-art DL model repair solutions and semi-supervised techniques for model improvements, especially when there is a distribution shift between the training data and the field data. Our code is available at https://doi.org/10.5281/zenodo.5914559.
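The two ingredients named in the abstract, propagating annotated labels to neighboring data and using only a selected part of the unlabeled data, can be illustrated with a minimal sketch. This is not the HybridRepair implementation: the abstract does not say how neighbors are defined or how unlabeled samples are chosen, so the Euclidean `radius` rule, the confidence `threshold`, and the function names `propagate_labels` and `select_confident` are all assumptions made for illustration.

```python
import numpy as np


def propagate_labels(features, labeled_idx, labels, radius):
    """Copy each annotated sample's label to neighbors within `radius`.

    Assumption: "neighboring data" means Euclidean distance in some
    feature space. Returns pseudo-labels, with -1 for samples left unlabeled.
    """
    features = np.asarray(features, dtype=float)
    pseudo = np.full(len(features), -1, dtype=int)
    best_dist = np.full(len(features), np.inf)
    for idx, lab in zip(labeled_idx, labels):
        dist = np.linalg.norm(features - features[idx], axis=1)
        # A sample takes the label of its closest annotated neighbor in range.
        take = (dist <= radius) & (dist < best_dist)
        pseudo[take] = lab
        best_dist[take] = dist[take]
    return pseudo


def select_confident(probs, threshold):
    """Return indices whose top predicted probability is >= threshold.

    Assumption: confidence thresholding stands in for the paper's
    (unspecified here) rule for choosing which unlabeled data to use.
    """
    probs = np.asarray(probs, dtype=float)
    return np.flatnonzero(probs.max(axis=1) >= threshold)


# Toy data: two tight clusters plus one far-away point.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 10.0]]
pseudo = propagate_labels(features, labeled_idx=[0, 2], labels=[0, 1], radius=0.5)
chosen = select_confident([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]], threshold=0.75)
```

In this toy setup, the far-away point keeps the placeholder label -1, so it would not enter the repair training set through propagation; only samples near an annotated point or above the confidence threshold are used.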

Wed 20 Jul

Displayed time zone: Seoul

07:00 - 08:20
Session 2-2: Neural Networks, Learning, NLP E (Technical Papers) at ISSTA 2
07:00
20m
Talk
Cross-Lingual Transfer Learning for Statistical Type Inference (ACM SIGSOFT Distinguished Paper)
Technical Papers
Zhiming Li Nanyang Technological University, Singapore, Xiaofei Xie Singapore Management University, Singapore, Haoliang Li City University of Hong Kong, Zhengzi Xu Nanyang Technological University, Yi Li Nanyang Technological University, Singapore, Yang Liu Nanyang Technological University
07:20
20m
Talk
DocTer: Documentation-Guided Fuzzing for Testing Deep Learning API Functions
Technical Papers
Danning Xie Purdue University, Yitong Li University of Waterloo, Mijung Kim UNIST, Hung Viet Pham University of Waterloo, Lin Tan Purdue University, Xiangyu Zhang Purdue University, Michael W. Godfrey University of Waterloo, Canada
07:40
20m
Talk
HybridRepair: Towards Annotation-Efficient Repair for Deep Learning Models
Technical Papers
Yu Li The Chinese University of Hong Kong, Muxi Chen The Chinese University of Hong Kong, Qiang Xu
08:00
20m
Talk
Human-in-the-Loop Oracle Learning for Semantic Bugs in String Processing Programs
Technical Papers
Charaka Geethal Monash University, Thuan Pham The University of Melbourne, Aldeida Aleti Monash University, Marcel Böhme MPI-SP, Germany and Monash University, Australia

Thu 21 Jul

Displayed time zone: Seoul

16:20 - 17:40
Session 3-6: Neural Networks, Learning, NLP F (Technical Papers) at ISSTA 2
16:20
20m
Talk
AEON: A Method for Automatic Evaluation of NLP Test Cases
Technical Papers
Jen-tse Huang The Chinese University of Hong Kong, Jianping Zhang The Chinese University of Hong Kong, Wenxuan Wang The Chinese University of Hong Kong, Pinjia He The Chinese University of Hong Kong, Shenzhen, Yuxin Su Sun Yat-sen University, Michael Lyu The Chinese University of Hong Kong
16:40
20m
Talk
HybridRepair: Towards Annotation-Efficient Repair for Deep Learning Models
Technical Papers
Yu Li The Chinese University of Hong Kong, Muxi Chen The Chinese University of Hong Kong, Qiang Xu
17:00
20m
Talk
Improving Cross-Platform Binary Analysis using Representation Learning via Graph Alignment
Technical Papers
Geunwoo Kim University of California, Irvine, USA, Sanghyun Hong Oregon State University, Michael Franz University of California, Irvine, Dokyung Song Yonsei University, South Korea
17:20
20m
Talk
Human-in-the-Loop Oracle Learning for Semantic Bugs in String Processing Programs
Technical Papers
Charaka Geethal Monash University, Thuan Pham The University of Melbourne, Aldeida Aleti Monash University, Marcel Böhme MPI-SP, Germany and Monash University, Australia