FedDebug: Systematic Debugging for Federated Learning Applications (ICSE 2023 - Artifact Evaluation)

Sun 14 - Sat 20 May 2023 Melbourne, Australia

Who

Waris Gill, Ali Anwar, Muhammad Ali Gulzar

Track

ICSE 2023 Artifact Evaluation

Abstract

Purpose of Research Artifact

This research artifact presents and demonstrates the operation of FedDebug, a two-pronged debugging solution that enables both interactive debugging and automated fault localization in Federated Learning (FL) applications.

Federated Learning (FL) is a privacy-preserving distributed machine learning technique. It enables individual clients (e.g., user participants, edge devices, or organizations) to train an ML model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. With the advent of frameworks like TensorFlow Federated, Flower, PaddleFL, PySyft, and IBMFL, FL is actively solving real-world problems. However, FL faces challenges in maintaining the integrity of the global model, as a single faulty client update (e.g., a local model trained on noisy data) can degrade the global model’s performance. Additionally, because the clients’ data is unavailable at the aggregator, it is difficult for developers to locate a faulty client. Furthermore, current methods for debugging FL applications do not provide interactive debugging without halting the FL training process.
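
The effect of a faulty client update on the global model can be illustrated with a minimal sketch of weighted federated averaging. This is a toy illustration only, not FedDebug's or any FL framework's implementation: the function names, client identifiers, weight vectors, and dataset sizes are all hypothetical, and models are reduced to short lists of floats.

```python
from typing import Dict, List

Weights = List[float]


def federated_average(client_weights: Dict[str, Weights],
                      client_sizes: Dict[str, int]) -> Weights:
    """Average client weight vectors, weighted by local dataset size."""
    total = sum(client_sizes[c] for c in client_weights)
    dim = len(next(iter(client_weights.values())))
    global_weights = [0.0] * dim
    for client, weights in client_weights.items():
        share = client_sizes[client] / total
        for i, w in enumerate(weights):
            global_weights[i] += share * w
    return global_weights


def distance(a: Weights, b: Weights) -> float:
    """Euclidean distance between two weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical toy round: three healthy clients close to the reference
# weights, plus one faulty client (e.g., trained on noisy data) far away.
reference_weights = [1.0, -2.0, 0.5]
healthy = {
    "client_0": [1.1, -2.0, 0.4],
    "client_1": [0.9, -1.9, 0.5],
    "client_2": [1.0, -2.1, 0.6],
}
faulty = {"client_3": [-4.0, 3.0, 5.0]}
sizes = {"client_0": 100, "client_1": 100, "client_2": 100, "client_3": 100}

clean_global = federated_average(healthy, sizes)
polluted_global = federated_average({**healthy, **faulty}, sizes)

print("error without faulty client:", distance(clean_global, reference_weights))
print("error with faulty client:   ", distance(polluted_global, reference_weights))
```

In this toy setting, one outlying update out of four is enough to pull the averaged model well away from the reference weights, and the aggregator sees only the weight vectors, not the data that produced them.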

Tool Artifact: FedDebug’s tool artifact supports a two-step process: interactive debugging, followed by automated fault localization.

FedDebug’s interactive debugging module takes inspiration from traditional debuggers, such as gdb, and enables real-time interactive debugging on a simulation of a live FL application. A developer can spawn a simulation of a live FL application using FedDebug’s breakpoint and inspect the current state, which contains information such as the clients’ models and their reported metrics (e.g., their training loss or hyperparameters). With step in, step out, step back, and step next, FedDebug allows a seamless transition between the rounds and clients at a given breakpoint, enabling fine-grained, step-by-step inspection of the application’s state. At any point, FedDebug allows a developer to remove a client, reaggregate the global model using a specific subset of clients, and resume training. The execution instructions are provided in FedDebug/readme.md. The directory FedDebug/debugging-constructs contains the Dockerfile to build the Docker image and interact with FedDebug’s debugging constructs.

When a developer finds a suspicious state (e.g., multiple clients report high training loss), FedDebug’s automated fault localization module precisely identifies the faulty client without requiring any real-world test data. The user can configure several fault localization scenarios in the file FedDebug/fault-localization/artifact.ipynb by varying the number of clients, datasets, data distribution, number of faulty clients, etc.
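
The inspect, remove, and reaggregate workflow described above can be sketched as follows. This is a hypothetical illustration and not FedDebug's API: the RoundState class, its fields, and its methods are assumptions made for this example, models are reduced to short weight vectors, and aggregation is a plain unweighted mean. The loss-threshold check only stands in for a developer's manual inspection at a breakpoint; it is not FedDebug's fault localization technique.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class RoundState:
    """Hypothetical snapshot of one FL round, as inspected at a breakpoint."""
    round_id: int
    client_models: Dict[str, List[float]]  # client id -> weight vector
    train_loss: Dict[str, float]           # client id -> reported training loss
    excluded: Set[str] = field(default_factory=set)

    def suspicious_clients(self, loss_threshold: float) -> List[str]:
        """Return non-excluded clients whose reported loss exceeds the threshold."""
        return sorted(
            c for c, loss in self.train_loss.items()
            if loss > loss_threshold and c not in self.excluded
        )

    def remove_client(self, client_id: str) -> None:
        """Exclude a client from any later aggregation of this round."""
        if client_id not in self.client_models:
            raise KeyError(f"unknown client: {client_id}")
        self.excluded.add(client_id)

    def reaggregate(self) -> List[float]:
        """Unweighted mean of the remaining clients' weight vectors."""
        kept = [w for c, w in self.client_models.items() if c not in self.excluded]
        if not kept:
            raise ValueError("no clients left to aggregate")
        return [sum(column) / len(kept) for column in zip(*kept)]


# Hypothetical state of round 5, with one client reporting a high loss.
state = RoundState(
    round_id=5,
    client_models={
        "client_0": [1.0, 2.0],
        "client_1": [3.0, 4.0],
        "client_2": [50.0, -60.0],
    },
    train_loss={"client_0": 0.25, "client_1": 0.5, "client_2": 7.0},
)

before = state.reaggregate()                  # global model with all clients
flagged = state.suspicious_clients(loss_threshold=1.0)
for client_id in flagged:
    state.remove_client(client_id)
after = state.reaggregate()                   # global model without flagged clients

print("flagged:", flagged)
print("global model before:", before)
print("global model after: ", after)
```

Keeping the excluded clients as a set on the round state, rather than deleting their models, means the original updates remain available for inspection if the developer steps back to the same round.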

With FedDebug, we pave the way for advanced software debugging techniques to be adapted in the emerging area of federated learning and the broader community of machine learning practitioners. The FedDebug artifact is available at https://zenodo.org/badge/latestdoi/584879212 and https://github.com/seed-vt/FedDebug .

Badges

In this submission, we present the artifacts of FedDebug to receive the Functional, Reusable, and Available badges. We provide clear documentation for setting up and using FedDebug. Not only is the code functional, but we have also made extra efforts to make the results of FedDebug easily reproducible online. To facilitate reuse and repurposing, we generalize FedDebug and make it runnable on Google Colab through Jupyter Notebooks, so that users without access to a GPU can still run it. Furthermore, for easier deployment and installation, we have ported all FedDebug features that are independent of the underlying FL framework to Google Colab for plug-and-play deployments aimed at broader research and commercial uses. The FedDebug/fault-localization/artifact.ipynb notebook provides a single interface for testing and reproducing the results for verification and validation purposes.

Technology Skills

To evaluate the artifacts of FedDebug, it is recommended that the user have a basic understanding of Python, Google Colab, and Docker. To evaluate the artifact on a local machine, knowledge of Jupyter Notebooks, Unix OS, and Python virtual environments (e.g., Anaconda) is required. However, familiarity with Google Colab is enough to engage with FedDebug’s fault localization module. Access to either an NVIDIA GPU or Google Colab is also required. The package can be installed using conda and pip. Additional information regarding the simulation environment can be found in Section 5 of the accompanying research paper.
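
Before starting an evaluation, it can help to check which of the two supported environments is available. The following is a minimal sketch that is not part of the FedDebug artifact; the function names are hypothetical, and both checks are heuristics. It assumes that the google.colab package is normally importable only on a Colab runtime and that the nvidia-smi tool is on the PATH when an NVIDIA driver is installed, which does not by itself guarantee a usable GPU.

```python
import importlib.util
import shutil
import sys


def running_in_colab() -> bool:
    """Heuristic: the google.colab package is normally present only on Colab."""
    if "google.colab" in sys.modules:
        return True
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent "google" package is missing or malformed.
        return False


def nvidia_driver_visible() -> bool:
    """Heuristic: nvidia-smi is on PATH when an NVIDIA driver is installed."""
    return shutil.which("nvidia-smi") is not None


def describe_environment() -> str:
    """Summarize which evaluation path looks feasible on this machine."""
    if running_in_colab():
        return "colab"
    if nvidia_driver_visible():
        return "local-gpu"
    return "no-gpu-detected"


print("detected environment:", describe_environment())
```

A result of "no-gpu-detected" suggests falling back to Google Colab, which the text above notes is sufficient for the fault localization module.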

Note that the experimental evaluations presented in the paper require training approximately 3491 individual models, which took several days on a top-of-the-line GPU (NVIDIA Tesla T4). However, for the purposes of artifact evaluation, we have selected representative settings that reflect our experiments and can be evaluated within a few hours on Google Colab. If such a GPU is available, the provided artifact allows replicating the configurations of the experiments described in the attached paper. More information on reconfiguring FedDebug is available in the attached artifact files FedDebug/fault-localization/artifact.ipynb and FedDebug/readme.md. Additionally, we provide a Docker image (Dockerfile) to test the functionality of FedDebug’s interactive debugging in the IBMFL library.