Towards Exploring the Limitations of Test Selection Techniques on Graph Neural Networks: An Empirical Study
Graph Neural Networks (GNNs) have gained prominence in various domains, such as social network analysis, recommendation systems, and drug discovery, due to their ability to model complex relationships in graph-structured data. However, GNNs can behave incorrectly, with potentially severe consequences, so testing them is essential. Labeling all test inputs for GNNs can be prohibitively costly and time-consuming, especially for large and complex graphs. Test selection has emerged as a way to reduce this labeling cost: its objective is to select a subset of tests from the complete test set. While various test selection techniques have been proposed for traditional deep neural networks (DNNs), adapting them to GNNs presents unique challenges because DNN and GNN test data differ. Specifically, DNN test inputs are independent of each other, whereas GNN test inputs (nodes) exhibit intricate interdependencies. It therefore remains unclear whether DNN test selection approaches perform effectively on GNNs. To fill this gap, we conduct an empirical study that systematically evaluates the effectiveness of various test selection methods in the context of GNNs, focusing on three critical aspects: 1) Misclassification detection: selecting test inputs that are more likely to be misclassified; 2) Accuracy estimation: selecting a small set of tests to precisely estimate the accuracy of the whole test set; 3) Performance enhancement: selecting retraining inputs to improve GNN accuracy. Our empirical study encompasses 7 graph datasets and 8 GNN models and evaluates 22 test selection approaches. It covers both node classification and graph classification datasets.
Our findings reveal that: 1) In GNN misclassification detection, confidence-based test selection methods, which perform well on DNNs, do not demonstrate the same level of effectiveness; 2) In GNN accuracy estimation, clustering-based methods consistently perform better than random selection but provide only slight improvements; 3) When selecting retraining inputs to improve GNN performance, test selection methods such as confidence-based and clustering-based methods demonstrate only slight effectiveness; 4) Also for performance enhancement, node importance-based test selection methods are not suitable, and in many cases they even perform worse than random selection.
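To make the idea of confidence-based test selection concrete, the following is a minimal sketch, not the study's implementation. It assumes the model exposes a softmax probability vector per test input and ranks inputs by a generic least-confidence score (one minus the top probability). The function names, the score, and the sample probabilities are illustrative assumptions; the abstract does not say which of the 22 evaluated approaches use this score.

```python
# Illustrative sketch only: a generic least-confidence ranking,
# not the specific methods evaluated in the study.

def least_confidence(probs):
    """Uncertainty score: higher means the model is less sure of its top class."""
    return 1.0 - max(probs)

def select_tests(prob_rows, budget):
    """Return indices of the `budget` test inputs with the highest uncertainty."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: least_confidence(prob_rows[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:budget]

# Hypothetical softmax outputs for five test nodes over three classes.
probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
    [0.80, 0.10, 0.10],
]

selected = select_tests(probs, budget=2)
print(selected)  # [3, 1]
```

Only the selected inputs would then be labeled. Note that this score treats each input independently, which is the DNN-style assumption the abstract contrasts with the interdependent nodes of a graph.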
Tue 29 Oct
Displayed time zone: Pacific Time (US & Canada)
10:30 - 12:00 | Test selection and prioritization | Research Papers / Journal-first Papers / NIER Track | at Camellia | Chair(s): Wing Lam (George Mason University)
10:30 | 15m | Talk | Towards Exploring the Limitations of Test Selection Techniques on Graph Neural Networks: An Empirical Study | Journal-first Papers | Xueqi Dang (University of Luxembourg, SnT), Yinghua LI (University of Luxembourg), Wei Ma (Nanyang Technological University), Yuejun GUo (Luxembourg Institute of Science and Technology), Qiang Hu (The University of Tokyo), Mike Papadakis (University of Luxembourg), Maxime Cordy (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
10:45 | 15m | Talk | Prioritizing Test Cases for Deep Learning-based Video Classifiers | Journal-first Papers | Yinghua LI (University of Luxembourg), Xueqi Dang (University of Luxembourg, SnT), Lei Ma (The University of Tokyo & University of Alberta), Jacques Klein (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg)
11:00 | 15m | Talk | Neuron Sensitivity Guided Test Case Selection | Journal-first Papers | Dong Huang (The University of Hong Kong), Qingwen Bu (Shanghai Jiao Tong University), Yichao FU (The University of Hong Kong), Yuhao Qing (University of Hong Kong), Xiaofei Xie (Singapore Management University), Junjie Chen (Tianjin University), Heming Cui (University of Hong Kong)
11:15 | 15m | Talk | FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection | Research Papers | Jialuo Chen (Zhejiang University), Jingyi Wang (Zhejiang University), Xiyue Zhang (University of Oxford), Youcheng Sun (University of Manchester), Marta Kwiatkowska (University of Oxford), Jiming Chen (Zhejiang University; Hangzhou Dianzi University), Peng Cheng (Zhejiang University)
11:30 | 15m | Talk | Hybrid Regression Test Selection by Integrating File and Method Dependences | Research Papers | Guofeng Zhang (College of Computer, National University of Defense Technology), Luyao Liu (College of Computer, National University of Defense Technology), Zhenbang Chen (College of Computer, National University of Defense Technology), Ji Wang (National University of Defense Technology)
11:45 | 10m | Talk | Prioritizing Tests for Improved Runtime | NIER Track | Abdelrahman Baz (The University of Texas at Austin), Minchao Huang (The University of Texas at Austin), August Shi (The University of Texas at Austin)