REDII: Test Infrastructure to Enable Deterministic Reproduction of Failures for Distributed Systems (ICSE 2025 - Research Track)

Who

Yang Feng, Zheyuan Lin, Dongchen Zhao, Mengbo Zhou, Jia Liu, James Jones

Track

ICSE 2025 Research Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 30 Apr 2025 12:00 - 12:15 at 205 - Testing and QA 1 Chair(s): Jonathan Bell

Abstract

Despite the fact that distributed systems have become a crucial aspect of modern technology and support many of the software systems that enable modern life, developers experience challenges in performing regression testing of these systems. Existing solutions for testing distributed systems are often either: (1) specialized testing environments that are created specifically for each system by its development team, which requires substantial effort for each team, with little-to-no sharing of this effort across teams; or (2) randomized injection tools that are often computationally expensive and offer no guarantees of preventing regressions, due to their randomness. The challenge of providing a generalized and practical solution to trigger bugs for reproducing and demonstrating failures, as well as to guard against regressions, is largely unaddressed.

In this work, we present REDII, an infrastructure for supporting regression testing of distributed systems. REDII contains a dataset of real bugs on common distributed systems, along with a generalizable testing framework REDIT that enables developers to write tests that can reproduce failures by providing ways to deterministically control distributed execution. In addition to the real failures in REDII from multiple distributed systems, REDIT provides a reusable, programmable, platform-agnostic, deterministic regression-testing framework for developers of distributed systems. It can help automate the running of such tests, for both practitioners and researchers. We demonstrate REDIT with 63 bugs that we selected in JIRA on 7 large and widely used distributed systems. Our case studies show that REDII can be used to allow developers to write tests that effectively reproduce bugs on distributed systems and generate specific scenarios for regression testing, as well as providing deterministic failure injection that can help developers and researchers to better understand deterministic failures that may occur in distributed systems in the future. Additionally, our studies show that REDII is efficient for real-world system regression testing, providing a powerful tool for all participants in this area.

Yang Feng

Nanjing University

China

Zheyuan Lin

Nanjing University

China

Dongchen Zhao

Nanjing University

Mengbo Zhou

Nanjing University

Jia Liu

Nanjing University

China

James Jones

University of California at Irvine

United States

Session Program

Wed 30 Apr
Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30	Testing and QA 1 (Research Track / Journal-first Papers) at 205. Chair(s): Jonathan Bell (Northeastern University)

11:00 15m Talk	Critical Variable State-Aware Directed Greybox Fuzzing (Research Track). Xu Chen (Institute of Information Engineering at Chinese Academy of Sciences, China / University of Chinese Academy of Sciences, China); Ningning Cui (Institute of Information Engineering at Chinese Academy of Sciences, China / University of Chinese Academy of Sciences, China); Zhe Pan (Institute of Information Engineering at Chinese Academy of Sciences, China / University of Chinese Academy of Sciences, China); Liwei Chen (Institute of Information Engineering at Chinese Academy of Sciences / University of Chinese Academy of Sciences); Gang Shi (Institute of Information Engineering at Chinese Academy of Sciences / University of Chinese Academy of Sciences); Dan Meng (Institute of Information Engineering at Chinese Academy of Sciences / University of Chinese Academy of Sciences)
11:15 15m Talk	LWDIFF: An LLM-Assisted Differential Testing Framework for WebAssembly Runtimes (Research Track). Shiyao Zhou (The Hong Kong Polytechnic University); Jincheng Wang (Hong Kong Polytechnic University); He Ye (University College London (UCL)); Hao Zhou (The Hong Kong Polytechnic University); Claire Le Goues (Carnegie Mellon University); Xiapu Luo (Hong Kong Polytechnic University)
11:30 15m Talk	No Harness, No Problem: Oracle-guided Harnessing for Auto-generating C API Fuzzing Harnesses (Research Track). Gabriel Sherman (University of Utah); Stefan Nagy (University of Utah)
11:45 15m Talk	Parametric Falsification of Many Probabilistic Requirements under Flakiness (Research Track). Matteo Camilli (Politecnico di Milano); Raffaela Mirandola (Karlsruhe Institute of Technology (KIT))
12:00 15m Talk	REDII: Test Infrastructure to Enable Deterministic Reproduction of Failures for Distributed Systems (Research Track). Yang Feng (Nanjing University); Zheyuan Lin (Nanjing University); Dongchen Zhao (Nanjing University); Mengbo Zhou (Nanjing University); Jia Liu (Nanjing University); James Jones (University of California at Irvine)
12:15 15m Talk	Adopting Automated Bug Assignment in Practice - A Longitudinal Case Study at Ericsson (Journal-first Papers). Markus Borg (CodeScene); Leif Jonsson (Ericsson AB); Emelie Engstrom (Lund University); Béla Bartalos (Verint); Attila Szabo (Ericsson)