EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries
Wed 11 May 2022 11:15 - 11:20 at ICSE room 2-odd hours - Performance and Reliability Chair(s): Andrea Zisman
Thu 26 May 2022 11:35 - 11:40 at Room 301+302 - Papers 15: Software Testing 2 Chair(s): Rohan Padhye
Testing deep learning (DL) software is crucial and challenging. Recent approaches use differential testing to cross-check pairs of implementations of the same functionality across different libraries. While useful, such approaches require two independent DL libraries that implement the same function, and such a pair is often unavailable. In addition, they rely on a high-level library, Keras, that implements missing functions in all supported DL libraries, which is prohibitively expensive and thus no longer maintained.
To address this issue, we propose EAGLE, a new technique that applies differential testing along a different dimension: it uses equivalent graphs to test a single DL implementation (e.g., a single DL library). Equivalent graphs use different APIs, data types, or optimizations to achieve the same functionality. The rationale is that two equivalent graphs executed on a single DL implementation should produce identical output given the same input. Specifically, we develop 17 new DL equivalence rules, and EAGLE (1) uses these equivalence rules to build concrete pairs of equivalent graphs and (2) cross-checks the output of these equivalent graphs to detect inconsistency bugs in a DL library.
Our evaluation on two widely-used DL libraries (TensorFlow and PyTorch) shows that EAGLE detects 20 bugs (12 in TensorFlow and 8 in PyTorch), including 9 previously unknown bugs.
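The cross-checking idea in the abstract can be illustrated with a minimal sketch. It is not EAGLE's implementation: plain Python stands in for a DL library, the equivalence used here (a fused log-softmax versus a softmax followed by a log) is a hypothetical example rather than one of the paper's rules, and the names graph_a, graph_b, and cross_check are invented for illustration. The floating-point tolerance is also an assumption, since two numerically different routes rarely agree bit for bit.

```python
import math


def graph_a(xs):
    """Log-softmax computed in one step via the log-sum-exp trick."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def graph_b(xs):
    """Equivalent graph: softmax first, then an element-wise log."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [math.log(e / total) for e in exps]


def cross_check(f, g, inputs, rel_tol=1e-9, abs_tol=1e-12):
    """Return the inputs on which the two graphs disagree."""
    failures = []
    for xs in inputs:
        out_f, out_g = f(xs), g(xs)
        same = len(out_f) == len(out_g) and all(
            math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, b in zip(out_f, out_g)
        )
        if not same:
            failures.append(xs)
    return failures


# Hypothetical test inputs; a real tool would generate many more.
inputs = [[0.0, 1.0, 2.0], [-3.5, 0.25, 4.0], [10.0, 10.0, 10.0]]
inconsistencies = cross_check(graph_a, graph_b, inputs)
```

An empty inconsistencies list means the two graphs agreed on every input. Any input that is returned is a candidate inconsistency to inspect, because both graphs ran on the same underlying implementation.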
Mon 9 May (displayed time zone: Eastern Time, US & Canada)
20:00 - 21:00 | Reliability and Safety 3 | Technical Track | ICSE room 3-even hours | Chair(s): Antonio Filieri, Imperial College London
20:00 | 5m Talk | Promal: Precise Window Transition Graphs for Android via Synergy of Program Analysis and Machine Learning | Technical Track | Changlin Liu, Case Western Reserve University; Hanlin Wang, Case Western Reserve University; Tianming Liu, Monash University; Diandian Gu, Peking University; Yun Ma, Peking University; Haoyu Wang, Huazhong University of Science and Technology, China; Xusheng Xiao, Case Western Reserve University
20:05 | 5m Talk | EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries | Technical Track | Jiannan Wang, Purdue University; Thibaud Lutellier, University of Waterloo; Shangshu Qian, Purdue University; Hung Viet Pham, University of Waterloo; Lin Tan, Purdue University
20:10 | 5m Talk | DeepTraLog: Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning | Technical Track | Chenxi Zhang, Fudan University; Xin Peng, Fudan University; Chaofeng Sha, Fudan University; Ke Zhang, Fudan University; Zhenqing Fu, Fudan University; Xiya Wu, Fudan University; Qingwei Lin, Microsoft Research; Dongmei Zhang, Microsoft Research
20:15 | 5m Talk | Repairing Brain-Computer Interfaces with Fault-based Data Acquisition | Technical Track | Cailin Winston, University of Washington; Caleb Winston, University of Washington; Chloe N Winston, University of Washington; Claris Winston, University of Washington; Cleah Winston; Rajesh PN Rao, University of Washington; René Just, University of Washington
20:20 | 5m Talk | PReach: A Heuristic for Probabilistic Reachability to Identify Hard to Reach Statements | Technical Track | Seemanta Saha, University of California, Santa Barbara; Mara Downing, University of California, Santa Barbara; Tegan Brennan; Tevfik Bultan, University of California, Santa Barbara
Wed 11 May (displayed time zone: Eastern Time, US & Canada)
11:00 - 12:00 | Performance and Reliability | Technical Track / Journal-First Papers | ICSE room 2-odd hours | Chair(s): Andrea Zisman, The Open University
11:00 | 5m Talk | Predicting unstable software benchmarks using static source code features | Journal-First Papers | Christoph Laaber, Simula Research Laboratory; Mikael Basmaci, University of Zurich; Pasquale Salza, University of Zurich
11:05 | 5m Talk | Evaluating the impact of falsely detected performance bug-inducing changes in JIT models | Journal-First Papers | Sophia Quach, Concordia University; Maxime Lamothe, Polytechnique Montréal; Bram Adams, Queen's University; Yasutaka Kamei, Kyushu University; Weiyi Shang, Concordia University
11:10 | 5m Talk | Using Reinforcement Learning for Load Testing of Video Games | Technical Track | Rosalia Tufano, Università della Svizzera italiana; Simone Scalabrino, University of Molise; Luca Pascarella, Università della Svizzera italiana (USI); Emad Aghajani, Software Institute, USI Università della Svizzera italiana; Rocco Oliveto, University of Molise; Gabriele Bavota, Software Institute, USI Università della Svizzera italiana
11:15 | 5m Talk | EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries | Technical Track | Jiannan Wang, Purdue University; Thibaud Lutellier, University of Waterloo; Shangshu Qian, Purdue University; Hung Viet Pham, University of Waterloo; Lin Tan, Purdue University
11:20 | 5m Talk | Decomposing Software Verification into Off-the-Shelf Components: An Application to CEGAR | Technical Track | Dirk Beyer, LMU Munich, Germany; Jan Haltermann, University of Oldenburg; Thomas Lemberger, LMU Munich; Heike Wehrheim, Carl von Ossietzky Universität Oldenburg / University of Oldenburg
11:25 | 5m Talk | Precise Divide-By-Zero Detection with Affirmative Evidence | Technical Track | Yiyuan Guo, The Hong Kong University of Science and Technology, Ant Group; Jinguo Zhou, Ant Group; Peisen Yao, The Hong Kong University of Science and Technology; Qingkai Shi, Ant Group; Charles Zhang, Hong Kong University of Science and Technology
Thu 26 May (displayed time zone: Eastern Time, US & Canada)
11:00 - 12:30 | Papers 15: Software Testing 2 | Technical Track / SEIP - Software Engineering in Practice | Room 301+302 | Chair(s): Rohan Padhye, Carnegie Mellon University
11:00 | 5m Talk | CONFETTI: Amplifying Concolic Guidance for Fuzzers | Technical Track | James Kukucka, George Mason University; Luís Pina, University of Illinois at Chicago; Paul Ammann, George Mason University, USA; Jonathan Bell, Northeastern University
11:05 | 5m Talk | Surveying the Developer Experience of Flaky Tests | SEIP - Software Engineering in Practice | Owain Parry, The University of Sheffield; Gregory Kapfhammer, Allegheny College; Michael Hilton, Carnegie Mellon University, USA; Phil McMinn, University of Sheffield
11:10 | 5m Talk | Natural Attack for Pre-trained Models of Code | Technical Track | Zhou Yang, Singapore Management University; Jieke Shi, Singapore Management University; Junda He, Singapore Management University; David Lo, Singapore Management University
11:15 | 5m Talk | FADATest: Fast and Adaptive Performance Regression Testing of Dynamic Binary Translation Systems | Technical Track | Jin Wu, Harbin Institute of Technology; Jian Dong, Harbin Institute of Technology; Ruili Fang, University of Georgia; Wen Zhang, University of Georgia; Wenwen Wang, University of Georgia; Decheng Zuo, Harbin Institute of Technology
11:20 | 5m Talk | Repairing Order-Dependent Flaky Tests via Test Generation | Technical Track | Chengpeng Li, University of Texas at Austin; Chenguang Zhu, University of Texas at Austin; Wenxi Wang, University of Texas at Austin; August Shi, University of Texas at Austin
11:25 | 5m Talk | BeDivFuzz: Integrating Behavioral Diversity into Generator-based Fuzzing | Technical Track |
11:30 | 5m Talk | Nessie: Automatically Testing JavaScript APIs with Asynchronous Callbacks | Technical Track | Ellen Arteca, Northeastern University; Sebastian Harner, University of Stuttgart; Michael Pradel, University of Stuttgart; Frank Tip, Northeastern University
11:35 | 5m Talk | EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries | Technical Track | Jiannan Wang, Purdue University; Thibaud Lutellier, University of Waterloo; Shangshu Qian, Purdue University; Hung Viet Pham, University of Waterloo; Lin Tan, Purdue University