Edge-Based Detection of Label Flipping Attacks in Federated Learning Using Explainable AI
Federated Learning (FL) is a decentralized machine learning approach that enables collaborative training among distributed clients while preserving data privacy, making it increasingly popular for privacy-sensitive applications over traditional centralized models. However, it introduces new security vulnerabilities that challenge conventional approaches to software vulnerability management. Among these, label flipping attacks (LFAs), in which malicious clients intentionally mislabel their training data, pose a unique threat to the integrity of FL models. This study presents an AI-driven, edge-based detection technique that leverages explainable AI (XAI) to strengthen security within FL environments. Our method combines Grad-CAM visualizations with DBSCAN clustering to analyze class-specific behavior across clients. By detecting anomalies in per-class Grad-CAM activation patterns, we identify malicious clients whose heatmaps betray flipped class labels. Because each class is examined independently, the approach captures attack signatures without relying on global model behavior, making it particularly robust to LFAs. Empirical results on benchmark datasets such as MNIST and FashionMNIST demonstrate that our method accurately detects LFAs even when malicious clients constitute a substantial portion of the network. This class-specific, XAI-driven approach contributes to the security of FL by offering an explainable and scalable solution for managing vulnerabilities in distributed AI systems.
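The core detection step can be illustrated with a short sketch. The snippet below is a minimal illustration of the per-class clustering idea, not the paper's implementation: it assumes Grad-CAM heatmaps have already been computed for each (client, class) pair, and the function name flag_suspicious_clients, the heatmap shape, and the DBSCAN parameters eps and min_samples are illustrative choices, not values taken from the study. Clients whose normalized heatmap for a given class falls outside the dense majority cluster (DBSCAN noise label -1) are flagged as potential label flippers.

    # Minimal sketch of per-class anomaly detection on Grad-CAM heatmaps.
    # Assumptions (not from the paper): heatmaps are precomputed per
    # (class, client); eps/min_samples are illustrative; outliers (label -1)
    # are treated as suspicious for that class.
    import numpy as np
    from sklearn.cluster import DBSCAN

    def flag_suspicious_clients(heatmaps, eps=0.3, min_samples=3):
        """heatmaps: dict {class_id: {client_id: np.ndarray of shape (H, W)}}.
        Returns {class_id: set of client_ids flagged as outliers}."""
        flagged = {}
        for cls, per_client in heatmaps.items():
            client_ids = sorted(per_client)
            # Flatten and L2-normalize each client's heatmap for this class so
            # DBSCAN compares activation *patterns* rather than magnitudes.
            X = np.stack([per_client[c].ravel() for c in client_ids])
            X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            # DBSCAN assigns -1 to low-density points: clients whose Grad-CAM
            # pattern for class cls deviates from the benign majority cluster.
            flagged[cls] = {c for c, l in zip(client_ids, labels) if l == -1}
        return flagged

    # Toy usage with synthetic heatmaps: nine benign clients share a similar
    # activation pattern; one simulated flipped-label client does not.
    rng = np.random.default_rng(0)
    benign = rng.random((7, 7))
    heatmaps = {0: {f"client{i}": benign + 0.05 * rng.random((7, 7)) for i in range(9)}}
    heatmaps[0]["client9"] = rng.random((7, 7))  # attacker-like deviation
    print(flag_suspicious_clients(heatmaps))  # expected: {0: {'client9'}}

Running this per class, rather than once over all heatmaps, mirrors the class-specific design described above: a flip between two classes only perturbs the heatmaps of those classes, so per-class clustering isolates the anomaly instead of diluting it across the whole activation space.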