A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation

Abstract The exponential growth of Big Data in healthcare, particularly in AI-driven medical diagnostics, has raised critical concerns about data privacy in medical image classification. With over 30% of healthcare organizations worldwide experiencing data breaches in the past year, the demand for s...

Bibliographic Details
Main Authors:	Rahul Haripriya, Nilay Khare, Manish Pandey, Sreemoyee Biswas
Format:	Article
Language:	English
Published:	SpringerOpen 2025-05-01
Series:	Journal of Big Data
Subjects:	Federated learning; Big Data; Machine learning; Artificial intelligence; Data privacy; Transfer learning
Online Access:	https://doi.org/10.1186/s40537-025-01169-8

_version_	1849312139478564864
author	Rahul Haripriya, Nilay Khare, Manish Pandey, Sreemoyee Biswas
author_sort	Rahul Haripriya
collection	DOAJ
description	Abstract The exponential growth of Big Data in healthcare, particularly in AI-driven medical diagnostics, has raised critical concerns about data privacy in medical image classification. With over 30% of healthcare organizations worldwide experiencing data breaches in the past year, the demand for secure, privacy-preserving solutions is more urgent than ever. This study explores a federated learning approach combined with transfer learning to enhance privacy in medical image classification using ResNet and VGG16 architectures. Pre-trained on ImageNet and fine-tuned on three specialized medical datasets (TB chest X-rays, brain tumor MRI scans, and diabetic retinopathy images), these models were deployed in a simulated multi-center healthcare environment. A major contribution of this work is the development of an adaptive aggregation methodology, which dynamically selects between Federated Averaging (FedAvg) and Federated Stochastic Gradient Descent (FedSGD) based on real-time data divergence observed across participating clients. Unlike conventional static aggregation methods, which uniformly apply the same update rule regardless of data heterogeneity, the proposed adaptive approach monitors gradients and data distributions at each communication round and selects the most suitable aggregation method dynamically. This adaptive strategy not only improves convergence but also optimizes resource utilization, making it suitable for multi-center healthcare networks where data heterogeneity is prevalent. The novelty of the proposed adaptive aggregation lies in its ability to maintain robust performance while minimizing computational costs, making it feasible for large-scale healthcare AI networks, such as hospital federated learning systems. 
Comparative analysis with baseline FL models, including FedAvg and FedSGD, shows that the adaptive aggregation method achieves comparable accuracy (up to 96.3%) while significantly reducing execution time by approximately 20% and maintaining a competitive F1-score. Additionally, the integration of privacy-preserving techniques ensures that sensitive patient data remains secure throughout the learning process. By integrating transfer learning with federated learning, this study presents a scalable and privacy-preserving framework for Big Data analytics in healthcare. The findings underscore the potential of adaptive aggregation to enhance federated learning efficiency across heterogeneous datasets, enabling medical institutions to develop high-accuracy diagnostic models without direct access to patient data.
format	Article
id	doaj-art-9fc4e916e90d4b86897a5b2ef1d3ec39
institution	Kabale University
issn	2196-1115
language	English
publishDate	2025-05-01
publisher	SpringerOpen
record_format	Article
series	Journal of Big Data
spelling	doaj-art-9fc4e916e90d4b86897a5b2ef1d3ec39 | 2025-08-20T03:53:12Z | eng | SpringerOpen | Journal of Big Data | 2196-1115 | 2025-05-01 | 121156 | 10.1186/s40537-025-01169-8 | A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation | Rahul Haripriya (0), Nilay Khare (1), Manish Pandey (2), Sreemoyee Biswas (3) | Department of Computer Science and Engineering, Maulana Azad National Institute of Technology (all four authors) | https://doi.org/10.1186/s40537-025-01169-8 | Federated learning; Big Data; Machine learning; Artificial intelligence; Data privacy; Transfer learning
title	A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation
topic	Federated learning; Big Data; Machine learning; Artificial intelligence; Data privacy; Transfer learning
url	https://doi.org/10.1186/s40537-025-01169-8
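
The adaptive aggregation summarized in the abstract (choosing between FedAvg and FedSGD per communication round according to observed client divergence) can be illustrated with a minimal sketch. This record does not state the divergence metric, the threshold, or which rule is preferred under high divergence, so everything specific here is an assumption made for illustration: the `divergence` score (weighted mean distance of client gradients from their weighted mean), the `threshold` and `lr` values, and the choice to use FedSGD when divergence is high. It is not the authors' implementation.

```python
from math import sqrt


def weighted_mean(vectors, sizes):
    """Sample-size-weighted average of equal-length vectors."""
    total = sum(sizes)
    dim = len(vectors[0])
    return [sum(v[i] * n for v, n in zip(vectors, sizes)) / total
            for i in range(dim)]


def divergence(vectors, sizes):
    """Hypothetical heterogeneity score (an assumption, not from the paper):
    weighted mean Euclidean distance of each client vector from the
    weighted mean of all client vectors."""
    center = weighted_mean(vectors, sizes)
    total = sum(sizes)
    return sum(
        n * sqrt(sum((a - b) ** 2 for a, b in zip(v, center)))
        for v, n in zip(vectors, sizes)
    ) / total


def aggregate(global_w, client_weights, client_grads, sizes,
              threshold=0.5, lr=0.1):
    """Pick an aggregation rule for one communication round.

    Assumed policy: high gradient divergence -> FedSGD (one server step
    on the averaged gradient); otherwise -> FedAvg (average the locally
    trained client weights). threshold and lr are illustrative values.
    """
    score = divergence(client_grads, sizes)
    if score > threshold:
        avg_grad = weighted_mean(client_grads, sizes)
        new_w = [w - lr * g for w, g in zip(global_w, avg_grad)]
        return new_w, "FedSGD"
    return weighted_mean(client_weights, sizes), "FedAvg"


if __name__ == "__main__":
    w, rule = aggregate(
        global_w=[0.0, 0.0],
        client_weights=[[1.0, 1.0], [3.0, 3.0]],
        client_grads=[[1.0, 0.0], [1.0, 0.0]],
        sizes=[1, 1],
    )
    print(rule, w)
```

In this sketch, clients with identical gradients yield a divergence of zero and the round falls back to FedAvg; a real deployment would tune the threshold and could equally invert the policy.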

Similar Items