EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries (ICSE 2022 - Technical Track)

Sun 8 - Fri 27 May 2022

Who

Jiannan Wang, Thibaud Lutellier, Shangshu Qian, Hung Viet Pham, Lin Tan

Track

ICSE 2022 Technical Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

When

Mon 9 May 2022 20:05 - 20:10 at ICSE room 3 - Reliability and Safety 3 Chair(s): Antonio Filieri
Wed 11 May 2022 11:15 - 11:20 at ICSE room 2 - Performance and Reliability Chair(s): Andrea Zisman
Thu 26 May 2022 11:35 - 11:40 at Room 301+302 - Papers 15: Software Testing 2 Chair(s): Rohan Padhye

Abstract

Testing deep learning (DL) software is crucial and challenging. Recent approaches use differential testing to cross-check pairs of implementations of the same functionality across different libraries. While useful, such approaches require two independent DL libraries that implement the same function, which are often unavailable. In addition, they rely on a high-level library, Keras, that implements missing functions in all supported DL libraries, which is prohibitively expensive and thus no longer maintained.

To address this issue, we propose EAGLE, a new technique that applies differential testing along a different dimension: it uses equivalent graphs to test a single DL implementation (e.g., a single DL library). Equivalent graphs use different APIs, data types, or optimizations to achieve the same functionality. The rationale is that two equivalent graphs executed on a single DL implementation should produce identical output given the same input. Specifically, we develop 17 new DL equivalence rules and propose a technique, EAGLE, that (1) uses these equivalence rules to build concrete pairs of equivalent graphs and (2) cross-checks the output of these equivalent graphs to detect inconsistency bugs in a DL library.

Our evaluation on two widely used DL libraries (TensorFlow and PyTorch) shows that EAGLE detects 20 bugs (12 in TensorFlow and 8 in PyTorch), including 9 previously unknown bugs.
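
The idea of cross-checking equivalent graphs can be illustrated with a minimal sketch. It is not EAGLE's implementation and does not use TensorFlow or PyTorch: the two softmax formulations stand in for a pair of equivalent graphs, `graph_faulty` is a deliberately broken variant, and all function names, the input generator, and the tolerance `tol` are assumptions made for illustration. A small tolerance is used instead of exact equality because floating-point results of mathematically equivalent computations can differ in the last bits.

```python
import math
import random


def graph_direct(xs):
    """Graph A (hypothetical): softmax computed directly."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def graph_via_log(xs):
    """Graph B (hypothetical): equivalent formulation, exp(log_softmax(x))."""
    m = max(xs)
    log_total = m + math.log(sum(math.exp(x - m) for x in xs))
    return [math.exp(x - log_total) for x in xs]


def graph_faulty(xs):
    """Deliberately faulty variant that loses precision, used to show a detection."""
    return [round(v, 2) for v in graph_direct(xs)]


def cross_check(graph_a, graph_b, inputs, tol=1e-9):
    """Run both graphs on every input; return the inputs whose outputs disagree."""
    failures = []
    for xs in inputs:
        out_a, out_b = graph_a(xs), graph_b(xs)
        if len(out_a) != len(out_b) or any(
            abs(a - b) > tol for a, b in zip(out_a, out_b)
        ):
            failures.append(xs)
    return failures


# Seeded random inputs so the run is reproducible.
rng = random.Random(0)
inputs = [[rng.uniform(-5, 5) for _ in range(8)] for _ in range(100)]

consistent = cross_check(graph_direct, graph_via_log, inputs)
inconsistent = cross_check(graph_direct, graph_faulty, inputs)
print("equivalent pair disagreements:", len(consistent))
print("faulty pair disagreements:", len(inconsistent))
```

Any input returned by `cross_check` is a candidate inconsistency to investigate. In a real setting the two graphs would be built from one library's own APIs rather than from hand-written functions.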

Link to Preprint

https://jiannanwang.github.io/files/eagle-icse22.pdf

Jiannan Wang

Purdue University

Thibaud Lutellier

University of Waterloo

Shangshu Qian

Purdue University

Hung Viet Pham

University of Waterloo

Canada

Lin Tan

Purdue University

United States

Session Program

Mon 9 May
Displayed time zone: Eastern Time (US & Canada)

20:00 - 21:00	Reliability and Safety 3, Technical Track, at ICSE room 3 Chair(s): Antonio Filieri Imperial College London

5m Talk		Promal: Precise Window Transition Graphs for Android via Synergy of Program Analysis and Machine Learning Technical Track Changlin Liu Case Western Reserve University, Hanlin Wang Case Western Reserve University, Tianming Liu Monash University, Diandian Gu Peking University, Yun Ma Peking University, Haoyu Wang Huazhong University of Science and Technology, China, Xusheng Xiao Case Western Reserve University
5m Talk		EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries Technical Track Jiannan Wang Purdue University, Thibaud Lutellier University of Waterloo, Shangshu Qian Purdue University, Hung Viet Pham University of Waterloo, Lin Tan Purdue University
5m Talk		DeepTraLog: Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning Technical Track Chenxi Zhang Fudan University, Xin Peng Fudan University, Chaofeng Sha Fudan University, Ke Zhang Fudan University, Zhenqing Fu Fudan University, Xiya Wu Fudan University, Qingwei Lin Microsoft Research, Dongmei Zhang Microsoft Research
5m Talk		Repairing Brain-Computer Interfaces with Fault-based Data Acquisition Technical Track Cailin Winston University of Washington, Caleb Winston University of Washington, Chloe N Winston University of Washington, Claris Winston University of Washington, Cleah Winston, Rajesh PN Rao University of Washington, René Just University of Washington
5m Talk		PReach: A Heuristic for Probabilistic Reachability to Identify Hard to Reach Statements Technical Track Seemanta Saha University of California Santa Barbara, Mara Downing University of California, Santa Barbara, Tegan Brennan, Tevfik Bultan University of California, Santa Barbara

Wed 11 May
Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:00	Performance and Reliability, Technical Track / Journal-First Papers, at ICSE room 2 Chair(s): Andrea Zisman The Open University

5m Talk		Predicting unstable software benchmarks using static source code features Journal-First Papers Christoph Laaber Simula Research Laboratory, Mikael Basmaci University of Zurich, Pasquale Salza University of Zurich
5m Talk		Evaluating the impact of falsely detected performance bug-inducing changes in JIT models Journal-First Papers Sophia Quach Concordia University, Maxime Lamothe Polytechnique Montréal, Bram Adams Queen's University, Yasutaka Kamei Kyushu University, Weiyi Shang Concordia University
5m Talk		Using Reinforcement Learning for Load Testing of Video Games Technical Track Rosalia Tufano Università della Svizzera Italiana, Simone Scalabrino University of Molise, Luca Pascarella Università della Svizzera italiana (USI), Emad Aghajani Software Institute, USI Università della Svizzera italiana, Rocco Oliveto University of Molise, Gabriele Bavota Software Institute, USI Università della Svizzera italiana
5m Talk		EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries Technical Track Jiannan Wang Purdue University, Thibaud Lutellier University of Waterloo, Shangshu Qian Purdue University, Hung Viet Pham University of Waterloo, Lin Tan Purdue University
5m Talk		Decomposing Software Verification into Off-the-Shelf Components: An Application to CEGAR Technical Track Dirk Beyer LMU Munich, Germany, Jan Haltermann University of Oldenburg, Thomas Lemberger LMU Munich, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
5m Talk		Precise Divide-By-Zero Detection with Affirmative Evidence Technical Track Yiyuan Guo The Hong Kong University of Science and Technology, Ant Group, Jinguo Zhou Ant Group, Peisen Yao The Hong Kong University of Science and Technology, Qingkai Shi Ant Group, Charles Zhang Hong Kong University of Science and Technology

Thu 26 May
Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30	Papers 15: Software Testing 2, Technical Track / SEIP - Software Engineering in Practice, at Room 301+302 Chair(s): Rohan Padhye Carnegie Mellon University

11:00 5m Talk		CONFETTI: Amplifying Concolic Guidance for Fuzzers Technical Track James Kukucka George Mason University, Luís Pina University of Illinois at Chicago, Paul Ammann George Mason University, USA, Jonathan Bell Northeastern University
11:05 5m Talk		Surveying the Developer Experience of Flaky Tests SEIP - Software Engineering in Practice Owain Parry The University of Sheffield, Gregory Kapfhammer Allegheny College, Michael Hilton Carnegie Mellon University, USA, Phil McMinn University of Sheffield
11:10 5m Talk		Natural Attack for Pre-trained Models of Code Technical Track Zhou Yang Singapore Management University, Jieke Shi Singapore Management University, Junda He Singapore Management University, David Lo Singapore Management University
11:15 5m Talk		FADATest: Fast and Adaptive Performance Regression Testing of Dynamic Binary Translation Systems Technical Track Jin Wu Harbin Institute of Technology, Jian Dong Harbin Institute Of Technology, Ruili Fang University of Georgia, Wen Zhang University of Georgia, Wenwen Wang University of Georgia, Decheng Zuo Harbin Institute of Technology
11:20 5m Talk		Repairing Order-Dependent Flaky Tests via Test Generation Technical Track Chengpeng Li University of Texas at Austin, Chenguang Zhu University of Texas at Austin, Wenxi Wang University of Texas at Austin, August Shi University of Texas at Austin
11:25 5m Talk		BeDivFuzz: Integrating Behavioral Diversity into Generator-based Fuzzing Technical Track Hoang Lam Nguyen Humboldt-Universität zu Berlin, Lars Grunske Humboldt-Universität zu Berlin
11:30 5m Talk		Nessie: Automatically Testing JavaScript APIs with Asynchronous Callbacks Technical Track Ellen Arteca Northeastern University, Sebastian Harner University of Stuttgart, Michael Pradel University of Stuttgart, Frank Tip Northeastern University
11:35 5m Talk		EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries Technical Track Jiannan Wang Purdue University, Thibaud Lutellier University of Waterloo, Shangshu Qian Purdue University, Hung Viet Pham University of Waterloo, Lin Tan Purdue University
