ECG-Federated-Learning: Secure Federated ECG Diagnosis

ECG-Federated-Learning is a Python-based open-source project with 306 GitHub stars, developed under the TTEH Lab at Dayananda Sagar University's Department of Computer Science and Engineering (Cyber Security). It tackles privacy issues in ECG-based cardiac diagnosis by implementing federated learning, where multiple simulated hospitals train models on local data without sharing raw patient signals. A global model aggregates these local updates via the Flower framework, while SHAP provides explanations for predictions to build trust in clinical use.

The approach targets IoT-enabled smart hospitals, where ECG data sensitivity under regulations like HIPAA prevents centralized training. Traditional methods risk breaches by pooling data, but this system keeps training decentralized, evaluates against centralized baselines, and maintains accuracy for detecting abnormalities like arrhythmias.

Core Features

This project combines federated learning with explainable AI for ECG classification. Key elements include:

Federated learning pipeline using Flower, simulating multiple clients (hospitals) that train locally and send model updates to a central server.
SHAP integration for model interpretability, generating visualizations of feature importance in ECG predictions.
PyTorch-based model definitions and training, tested on Python 3.11.
Dual evaluation in centralized and federated modes to compare privacy-preserving performance.
MIT-licensed code with badges highlighting Python, PyTorch, federated learning, and explainability.

These features address core needs in healthcare AI: data privacy, model transparency, and diagnostic reliability.

System Architecture

The architecture follows a distributed setup outlined in the README. Clients represent hospitals with private ECG datasets. Each client trains locally on its own data using a shared model architecture. The Flower server orchestrates rounds of aggregation, typically via FedAvg (Federated Averaging).
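For reference, the FedAvg aggregation step is usually written as a sample-weighted mean of the client weights. The symbols below are the conventional ones from the federated learning literature, not notation taken from the README:

```latex
w^{(t+1)} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_k^{(t+1)}, \qquad n = \sum_{k=1}^{K} n_k
```

Here K is the number of participating clients, n_k is the number of training samples held by client k, and w_k^{(t+1)} is the set of weights client k returns after local training in round t. Clients with more data therefore pull the global model further toward their local solution.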

Components include data loaders for ECG signals, client-side trainers, a central aggregator, and SHAP analyzers post-training. The README table (partially shown) maps elements like these to technologies: Flower for orchestration, PyTorch for models, and SHAP for explanations.

Simulated IoT environments mimic real smart hospitals, where devices stream ECG data without central storage. Because only model updates travel to the server, the design stays compatible with edge deployments.

How It Works

Training starts with clients loading ECG datasets; public benchmarks like MIT-BIH appear to be the intended source for reproducibility. Each local model is optimized on its client's data, likely with CNNs or RNNs suited to time-series ECG signals.

After local epochs, clients upload weight updates (not data) to the Flower server. The server averages them into a global model and redistributes it. Multiple rounds repeat until convergence.
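The round structure described above can be sketched without any framework. The following is a minimal illustration, assuming a toy linear model and made-up client data in place of the project's PyTorch networks and ECG signals; the function names (local_train, fed_avg, make_client_data) are hypothetical and do not come from the repository or from Flower.

```python
import random

def local_train(weights, data, lr=0.2, epochs=5):
    """Run a few epochs of gradient descent on y = w*x + b using only this client's data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

def fed_avg(client_weights, client_sizes):
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def make_client_data(n, rng):
    """Hypothetical private dataset: noisy samples of y = 2x + 1."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

rng = random.Random(0)
clients = [make_client_data(n, rng) for n in (20, 50, 30)]  # three simulated hospitals

global_weights = [0.0, 0.0]
for round_num in range(30):
    # Each client starts from the current global model and trains locally.
    updates = [local_train(global_weights, data) for data in clients]
    # Only the weights (never the data) reach the server, which averages them.
    global_weights = fed_avg(updates, [len(d) for d in clients])

print(global_weights)  # expected to land close to the true values [2.0, 1.0]
```

In a Flower-based setup the framework takes over the loop and the transport between clients and server, but the division of labor is the same: training happens where the data lives, and the server only ever sees parameters.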

Post-training, SHAP computes per-feature attribution values and renders them as plots such as SHAP summary plots, showing which ECG waveform segments (e.g., the QRS complex) drive predictions for classes like normal sinus rhythm or ventricular tachycardia.
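To make the attribution idea concrete, the sketch below computes exact Shapley values by enumerating every feature coalition. This is the quantity that the SHAP library approximates efficiently for real models; it is not the SHAP library itself, and the three-feature toy_model and its feature names are assumptions made up for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition are replaced by the baseline."""
    n = len(x)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy "classifier score" over three hand-picked ECG features.
def toy_model(features):
    qrs_width, rr_variability, st_shift = features
    return 0.5 * qrs_width + 2.0 * rr_variability + 0.3 * qrs_width * st_shift

x = [1.2, 0.8, 0.5]          # the sample being explained
baseline = [1.0, 0.1, 0.0]   # reference input standing in for "feature absent"
phi = shapley_values(toy_model, x, baseline)
for name, v in zip(["qrs_width", "rr_variability", "st_shift"], phi):
    print(f"{name}: {v:+.3f}")
```

The attributions always sum to the difference between the model output for the sample and for the baseline, which is what lets a summary plot be read as a decomposition of each prediction. Exact enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.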

The README details this in sections like "How It Works," "Core Modules," and "Code Architecture," covering data preprocessing, model definitions, and client-server scripts.

Getting It Running

The project targets Python 3.11 users familiar with ML environments. Clone the repository from https://github.com/PoorvikaN/ECG-Federated-Learning.

Setup follows the "Setup & Usage" section in the README. Create a virtual environment:

python -m venv fl_env
source fl_env/bin/activate  # On Linux/Mac
# or
fl_env\Scripts\activate  # On Windows

Install dependencies, primarily PyTorch, Flower (the flwr package, documented at flower.ai), SHAP, and possibly ECG-specific libraries like NeuroKit2 or WFDB (inferred from context; check the README for the actual requirements). A typical pip install might look like:

pip install torch torchvision torchaudio
pip install "flwr[simulation]"  # quotes keep shells like zsh from expanding the brackets
pip install shap
pip install scikit-learn pandas numpy matplotlib
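After installation, a short check can confirm that the packages are visible in the active virtual environment. This is a minimal sketch using only the standard library; the list mirrors the pip commands above and is an assumption about what the project needs, not a copy of its requirements file.

```python
from importlib.util import find_spec

# Import names for the packages installed above (sklearn is scikit-learn's import name).
REQUIRED = ["torch", "flwr", "shap", "sklearn", "pandas", "numpy", "matplotlib"]

def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```

Because find_spec only locates modules without importing them, the check runs quickly even when heavy packages like PyTorch are present.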

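Run federated training with the scripts in the core modules. The exact entry point is defined in the README; a client.py and server.py pair driven by Flower's simulation mode is a common layout, so the command below uses a placeholder script name:

python server.py  # placeholder name; substitute the entry point given in the README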

For explainability, invoke SHAP after loading the trained model. Depending on the repository's data loaders, datasets either download automatically or need to be placed manually in a data/ directory. The README's "Implementation Results" section shows expected outputs such as accuracy metrics and SHAP plots.

Modest hardware is enough for testing: a GPU accelerates PyTorch, but a CPU suffices for simulations with few clients.

Performance Evaluation

Evaluations compare federated to centralized training. The federated setup is reported to reach accuracy comparable to the centralized baseline (exact figures are in the README's "Performance Evaluation" and "Implementation Results" sections), which suggests that keeping data local need not cost much diagnostic performance.

Metrics cover precision, recall, and F1 for multi-class ECG abnormalities. Result plots likely include confusion matrices and ROC curves.
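As a reminder of how these per-class metrics fall out of a confusion matrix, here is a minimal sketch. The class names and counts are made up for illustration and are not results from the project; per_class_metrics is a hypothetical helper, not a function from the repository.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(len(labels))) - tp  # column minus diagonal
        fn = sum(confusion[k]) - tp                                 # row minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        results[label] = {"precision": precision, "recall": recall, "f1": f1}
    return results

# Made-up counts for illustration only; not results from the project.
labels = ["normal", "afib", "vtach"]
confusion = [
    [90, 8, 2],
    [10, 35, 5],
    [0, 5, 45],
]
metrics = per_class_metrics(confusion, labels)
macro_f1 = sum(m["f1"] for m in metrics.values()) / len(labels)

for label, m in metrics.items():
    print(f"{label}: P={m['precision']:.2f} R={m['recall']:.2f} F1={m['f1']:.2f}")
print(f"macro F1: {macro_f1:.2f}")
```

Running the same computation on the centralized and the federated model gives a like-for-like comparison, and the per-class view matters because rare arrhythmia classes can be hidden by a high overall accuracy.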

SHAP analysis shows consistent explanations across clients, which suggests the model behaves robustly despite data silos.

Explainability in Action

SHAP values highlight influential time points in ECG signals. For instance, a model's decision on atrial fibrillation might emphasize irregular R-R intervals. Force plots and dependence plots aid clinicians in verifying AI outputs.
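To show why R-R irregularity is a plausible signal for a model to weight, the sketch below measures it directly from R-peak timestamps. This is an assumption-laden illustration, not part of the project's pipeline: the peak times are invented, and the coefficient of variation is just one simple irregularity measure among many.

```python
from statistics import mean, stdev

def rr_intervals(r_peak_times):
    """Differences between consecutive R-peak timestamps, in seconds."""
    return [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]

def rr_irregularity(r_peak_times):
    """Coefficient of variation of the R-R intervals (near 0 for a regular rhythm)."""
    rr = rr_intervals(r_peak_times)
    return stdev(rr) / mean(rr)

# Hypothetical R-peak times in seconds, made up for illustration.
regular = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
irregular = [0.0, 0.6, 1.7, 2.1, 3.3, 3.8]

print(f"regular:   {rr_irregularity(regular):.3f}")
print(f"irregular: {rr_irregularity(irregular):.3f}")
```

A clinician checking a SHAP plot for an atrial fibrillation prediction would expect the highlighted time points to coincide with beats whose spacing drives a measure like this one.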

This transparency suits regulatory scrutiny, as black-box models face resistance in medicine.

Who This Is For

Researchers in federated learning, healthcare AI, or cyber security will find value here, especially students at institutions like Dayananda Sagar University replicating the work. Developers building IoT health prototypes can adapt the Flower + PyTorch stack for edge devices.

Use cases include simulating multi-hospital collaborations for rare disease detection or prototyping privacy-compliant telemedicine. Keywords like "Federated Learning," "ECG Classification," and "Privacy-Preserving Machine Learning" guide searchers to it.

It's less ideal for production without custom scaling: simulations suit proofs of concept, not live patient streams.

Comparisons to Alternatives

Centralized tools like standard PyTorch ECG classifiers (e.g., on Kaggle datasets) lack privacy; this adds Flower overhead but preserves data locality.

Other federated frameworks exist: TensorFlow Federated for broader ML, or OpenFL for healthcare. This project's SHAP focus differentiates it for explainability needs.

Compared with standalone XAI libraries like LIME, it embeds explanations in a full FL pipeline. The FL simulation also makes it heavier than lightweight tools like HeartPy, a Python heart rate analysis toolkit.

Limitations noted in the README include the simulation-only setup (no real devices), potential non-IID data challenges across clients, and compute demands for many rounds.
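The non-IID concern is easy to reproduce in a simulation by giving each client only a subset of the classes. The sketch below shows one simple label-skew partitioning scheme; it is an assumption about how such an experiment could be set up, not the repository's data loader, and the class labels and counts are invented.

```python
import random
from collections import Counter, defaultdict

def label_skew_partition(labels, num_clients, classes_per_client, seed=0):
    """Assign sample indices to clients so each client sees only a few classes."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Pick which classes each client holds, cycling through the class list.
    client_classes = [
        [classes[(c + k) % len(classes)] for k in range(classes_per_client)]
        for c in range(num_clients)
    ]
    partitions = [[] for _ in range(num_clients)]
    for cls in classes:
        holders = [c for c in range(num_clients) if cls in client_classes[c]]
        if not holders:
            raise ValueError(f"no client holds class {cls!r}")
        indices = by_class[cls][:]
        rng.shuffle(indices)
        # Deal this class's samples round-robin among the clients that hold it.
        for pos, idx in enumerate(indices):
            partitions[holders[pos % len(holders)]].append(idx)
    return partitions

# Made-up class counts for illustration only.
labels = ["N"] * 60 + ["AF"] * 20 + ["VT"] * 12 + ["PVC"] * 8
parts = label_skew_partition(labels, num_clients=4, classes_per_client=2)
for c, idxs in enumerate(parts):
    print(c, dict(Counter(labels[i] for i in idxs)))
```

With partitions like these, each simulated hospital trains toward a different local optimum, which is the situation in which plain weighted averaging tends to converge more slowly and may need more rounds.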

The source code and full results reside at https://github.com/PoorvikaN/ECG-Federated-Learning, offering a solid starting point for privacy-focused ECG AI experiments.
