FaultFuzz: A Coverage Guided Fault Injection Tool for Distributed Systems
Distributed systems are expected to correctly recover from various faults, e.g., node crash / reboot and network disconnection / reconnection. However, faults that occur under special timing can trigger fault recovery bugs caused by incorrect fault recovery protocols and implementations. Existing random and brute-force fault injection approaches are not effective in revealing fault recovery bugs due to the combinatorial explosion of multiple faults in distributed systems.
In this paper, we propose FaultFuzz, a coverage guided fault injection approach that can systematically and effectively test fault recovery behaviors in distributed systems. Based on runtime feedback collected during distributed system testing, e.g., code coverage and I/O information, FaultFuzz generates possible combinations of faults and preferentially selects the combinations that are more likely to trigger new fault recovery behaviors and reveal new fault recovery bugs. We have applied FaultFuzz to three widely used distributed systems, i.e., ZooKeeper, HDFS, and HBase, and found 5 bugs in them. A video demonstration of FaultFuzz is available at https://youtu.be/SMw1ZF1vyXw.
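The general idea of feedback-driven selection of fault combinations can be illustrated with a small, self-contained Python sketch. It is not FaultFuzz's implementation: the fault types, the I/O points, the toy `run_with_faults` function (which stands in for a real test run that would deploy the system and collect coverage), and the priority rule are all hypothetical assumptions made for illustration.

```python
import heapq
import itertools

# Hypothetical fault types and injection points (assumptions for this sketch).
FAULT_TYPES = ("crash", "reboot", "disconnect", "reconnect")
IO_POINTS = range(4)


def run_with_faults(seq):
    """Toy stand-in for one test run; returns the set of 'covered' items.

    A real harness would start the system under test, inject the faults in
    `seq` at the given I/O points, and collect code coverage.
    """
    covered = {"base"}
    for i, (fault, point) in enumerate(seq):
        covered.add(f"{fault}-handler")
        # Pretend recovery code is only reached by a reboot right after a crash.
        if fault == "reboot" and i > 0 and seq[i - 1][0] == "crash":
            covered.add(f"recover@{point}")
    return frozenset(covered)


def coverage_guided_search(budget=50, max_faults=3):
    """Explore fault sequences, extending only those that add new coverage."""
    seen = set()                 # coverage accumulated over all runs
    interesting = []             # sequences that contributed new coverage
    counter = itertools.count()  # tie-breaker so the heap never compares tuples of faults
    queue = [(0, next(counter), ())]
    runs = 0
    while queue and runs < budget:
        _, _, seq = heapq.heappop(queue)
        covered = run_with_faults(seq)
        runs += 1
        new = covered - seen
        if not new:
            continue             # no new behavior: do not extend this sequence
        seen |= new
        interesting.append(seq)
        if len(seq) < max_faults:
            for fault in FAULT_TYPES:
                for point in IO_POINTS:
                    # More new coverage gives a smaller key, so children run earlier.
                    child = seq + ((fault, point),)
                    heapq.heappush(queue, (-len(new), next(counter), child))
    return seen, interesting


if __name__ == "__main__":
    seen, interesting = coverage_guided_search()
    print(sorted(seen))
    print(len(interesting), "sequences added new coverage")
```

In this toy model, only sequences that exposed new coverage are extended with further faults, so the search spends its budget on a small part of the combinatorial space instead of enumerating every fault combination.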
Wed 17 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Testing 2 | Research Track / Software Engineering Education and Training / Software Engineering in Practice / Demonstrations / Journal-first Papers | Eugénio de Andrade | Chair(s): Jonathan Bell (Northeastern University)
14:00 | 15m Talk | Ripples of a Mutation — An Empirical Study of Propagation Effects in Mutation Testing | Research Track | Hang Du (University of California at Irvine), Vijay Krishna Palepu (Microsoft), James Jones (University of California at Irvine)
14:15 | 15m Talk | Fast Deterministic Black-box Context-free Grammar Inference | Research Track | Mohammad Rifat Arefin (The University of Texas at Arlington), Suraj Shetiya (University of Texas at Arlington), Zili Wang (Iowa State University), Christoph Csallner (University of Texas at Arlington)
14:30 | 15m Talk | Bridging Theory to Practice in Software Testing Teaching through Team-based Learning (TBL) and Open Source Software (OSS) Contribution | Software Engineering Education and Training
14:45 | 15m Talk | Productive Coverage: Improving the Actionability of Code Coverage | Software Engineering in Practice | Marko Ivanković (Google; Universität Passau), Goran Petrović (Google Inc), Yana Kulizhskaya (Google Inc), Mateusz Lewko (Google Inc), Luka Kalinovčić (no affiliation), René Just (University of Washington), Gordon Fraser (University of Passau)
15:00 | 15m Talk | Taming Timeout Flakiness: An Empirical Study of SAP HANA | Software Engineering in Practice
15:15 | 7m Talk | Testing Abstractions for Cyber-Physical Control Systems | Journal-first Papers | Claudio Mandrioli (University of Luxembourg), Max Nyberg Carlsson (Lund University), Martina Maggio (Saarland University, Germany / Lund University, Sweden)
15:22 | 7m Talk | FaultFuzz: A Coverage Guided Fault Injection Tool for Distributed Systems | Demonstrations | Wenhan Feng (Institute of Software, Chinese Academy of Sciences), Qiugen Pei (Joint Laboratory on Cyberspace Security, China Southern Power Grid), Yu Gao (Institute of Software, Chinese Academy of Sciences, China), Dong Wang (Institute of Software, Chinese Academy of Sciences), Wensheng Dou (Institute of Software, Chinese Academy of Sciences), Jun Wei (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences; University of Chinese Academy of Sciences Chongqing School), Zheheng Liang (Joint Laboratory on Cyberspace Security of China Southern Power Grid), Zhenyue Long (Joint Laboratory on Cyberspace Security, China Southern Power Grid)