One Step Further: Evaluating Interpreters Using Metamorphic Testing
Wed 20 Jul 2022 18:00 - 18:20 at ISSTA 1 - Session 3-3: Test Generation and Mutation C Chair(s): Stefan Winter
The black-box nature of Deep Neural Networks (DNNs) makes it difficult for people to understand why a network makes a specific decision, which restricts its application in critical tasks. Recently, many interpreters (interpretation methods) have been proposed to improve the transparency of DNNs by providing relevant features in the form of a saliency map. However, different interpreters might produce different interpretation results for the same classification case, which motivates us to evaluate the robustness of interpreters.
However, the biggest challenge in evaluating interpreters is the testing oracle problem, i.e., the difficulty of labeling ground-truth interpretation results. To fill this critical gap, we first use images with bounding boxes from an object detection system, together with images inserted with backdoor triggers, as our original ground-truth dataset. Then, we apply metamorphic testing to extend the dataset with three operators: inserting an object, deleting an object, and feature-squeezing the image background. Our key intuition is that, since these three operations do not modify the primary detected objects, the interpretation results of a good interpreter should not change. Finally, we quantitatively measure the quality of interpretation results with the Intersection-over-Minimum (IoMin) score and evaluate interpreters based on the failure statistics of the metamorphic relations.
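The metamorphic check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the binary-mask representation of saliency maps, and the pass threshold are assumptions; the paper defines IoMin as the intersection of two regions divided by the smaller region's area.

```python
import numpy as np

def iomin(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-Minimum of two binary saliency masks.

    IoMin = |A ∩ B| / min(|A|, |B|), where |.| counts salient pixels.
    """
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = min(a.sum(), b.sum())
    return float(inter) / denom if denom > 0 else 0.0

def relation_holds(mask_orig: np.ndarray,
                   mask_meta: np.ndarray,
                   threshold: float = 0.5) -> bool:
    """Metamorphic relation: after a content-preserving operation
    (insert/delete a background object, feature-squeeze the background),
    the interpretation should stay close to the original.
    The 0.5 threshold here is a placeholder, not the paper's setting."""
    return iomin(mask_orig, mask_meta) >= threshold
```

A run of such checks over many metamorphic images yields the failure statistics used to rank interpreters by robustness.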
We evaluate seven popular interpreters on 877,324 metamorphic images across diverse scenes. The results show that our approach can quantitatively evaluate interpreters' robustness, with Grad-CAM providing the most reliable interpretation results among the seven interpreters.
Wed 20 Jul (displayed time zone: Seoul)
01:20 - 02:20 | Session 1-2: Test Generation and Mutation A, Technical Papers at ISSTA 2. Chair(s): Raghavan Komondoor (IISc Bengaluru)

01:20 (20m, Talk) | On the Use of Mutation Analysis For Evaluating Student Test Suite Quality. Technical Papers. James Perretta (Northeastern University), Andrew DeOrio (University of Michigan), Arjun Guha (Northeastern University), Jonathan Bell (Northeastern University). DOI

01:40 (20m, Talk) | Automated Test Generation for REST APIs: No Time to Rest Yet. Technical Papers. DOI

02:00 (20m, Talk) | One Step Further: Evaluating Interpreters Using Metamorphic Testing. Technical Papers. Ming Fan (Xi'an Jiaotong University), Jiali Wei (Xi'an Jiaotong University), Wuxia Jin (Xi'an Jiaotong University), Zhou Xu (Wuhan University), Wenying Wei (Xi'an Jiaotong University), Ting Liu (Xi'an Jiaotong University). DOI
18:00 - 19:00 | Session 3-3: Test Generation and Mutation C, Technical Papers at ISSTA 1. Chair(s): Stefan Winter (LMU Munich)

18:00 (20m, Talk) | One Step Further: Evaluating Interpreters Using Metamorphic Testing. Technical Papers. Ming Fan (Xi'an Jiaotong University), Jiali Wei (Xi'an Jiaotong University), Wuxia Jin (Xi'an Jiaotong University), Zhou Xu (Wuhan University), Wenying Wei (Xi'an Jiaotong University), Ting Liu (Xi'an Jiaotong University). DOI

18:20 (20m, Talk) | Test Mimicry to Assess the Exploitability of Library Vulnerabilities. Technical Papers. Hong Jin Kang (Singapore Management University, Singapore), Truong Giang Nguyen (School of Computing and Information Systems, Singapore Management University), Xuan Bach D. Le (The University of Melbourne), Corina S. Pasareanu (Carnegie Mellon University Silicon Valley, NASA Ames Research Center), David Lo (Singapore Management University). DOI

18:40 (20m, Talk) | RegMiner: Towards Constructing a Large Regression Dataset from Code Evolution History. Technical Papers. Xuezhi Song (Fudan University), Yun Lin (National University of Singapore), Siang Hwee Ng (National University of Singapore), Yijian Wu (Fudan University), Xin Peng (Fudan University), Jin Song Dong (National University of Singapore), Hong Mei (Peking University). DOI, Pre-print