A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation
Abstract The exponential growth of Big Data in healthcare, particularly in AI-driven medical diagnostics, has raised critical concerns about data privacy in medical image classification. With over 30% of healthcare organizations worldwide experiencing data breaches in the past year, the demand for s...
| Main Authors: | Rahul Haripriya, Nilay Khare, Manish Pandey, Sreemoyee Biswas |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SpringerOpen 2025-05-01 |
| Series: | Journal of Big Data |
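| Subjects: | Federated learning, Big Data, Machine learning, Artificial intelligence, Data privacy, Transfer learning |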
| Online Access: | https://doi.org/10.1186/s40537-025-01169-8 |
| _version_ | 1849312139478564864 |
|---|---|
| author | Rahul Haripriya Nilay Khare Manish Pandey Sreemoyee Biswas |
| author_facet | Rahul Haripriya Nilay Khare Manish Pandey Sreemoyee Biswas |
| author_sort | Rahul Haripriya |
| collection | DOAJ |
| description | Abstract: The exponential growth of Big Data in healthcare, particularly in AI-driven medical diagnostics, has raised critical concerns about data privacy in medical image classification. With over 30% of healthcare organizations worldwide experiencing data breaches in the past year, the demand for secure, privacy-preserving solutions is more urgent than ever. This study explores a federated learning approach combined with transfer learning to enhance privacy in medical image classification using ResNet and VGG16 architectures. Pre-trained on ImageNet and fine-tuned on three specialized medical datasets (TB chest X-rays, brain tumor MRI scans, and diabetic retinopathy images), these models were deployed in a simulated multi-center healthcare environment. A major contribution of this work is the development of an adaptive aggregation methodology, which dynamically selects between Federated Averaging (FedAvg) and Federated Stochastic Gradient Descent (FedSGD) based on real-time data divergence observed across participating clients. Unlike conventional static aggregation methods, which apply the same update rule regardless of data heterogeneity, the proposed adaptive approach monitors gradients and data distributions at each communication round and selects the most suitable aggregation method for that round. This adaptive strategy not only improves convergence but also optimizes resource utilization, making it suitable for multi-center healthcare networks where data heterogeneity is prevalent. The novelty of the proposed adaptive aggregation lies in its ability to maintain robust performance while minimizing computational costs, making it feasible for large-scale healthcare AI networks, such as hospital federated learning systems. Comparative analysis with baseline FL models, including FedAvg and FedSGD, shows that the adaptive aggregation method achieves comparable accuracy (up to 96.3%) while reducing execution time by approximately 20% and maintaining a competitive F1-score. Additionally, the integration of privacy-preserving techniques ensures that sensitive patient data remains secure throughout the learning process. By integrating transfer learning with federated learning, this study presents a scalable and privacy-preserving framework for Big Data analytics in healthcare. The findings underscore the potential of adaptive aggregation to enhance federated learning efficiency across heterogeneous datasets, enabling medical institutions to develop high-accuracy diagnostic models without direct access to patient data. |
| format | Article |
| id | doaj-art-9fc4e916e90d4b86897a5b2ef1d3ec39 |
| institution | Kabale University |
| issn | 2196-1115 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | SpringerOpen |
| record_format | Article |
| series | Journal of Big Data |
| spelling | doaj-art-9fc4e916e90d4b86897a5b2ef1d3ec39; 2025-08-20T03:53:12Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2025-05-01; 121156; 10.1186/s40537-025-01169-8; A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation; Rahul Haripriya (0), Nilay Khare (1), Manish Pandey (2), Sreemoyee Biswas (3); affiliations 0 to 3: Department of Computer Science and Engineering, Maulana Azad National Institute of Technology; https://doi.org/10.1186/s40537-025-01169-8; Federated learning; Big Data; Machine learning; Artificial intelligence; Data privacy; Transfer learning |
| title | A privacy-enhanced framework for collaborative Big Data analysis in healthcare using adaptive federated learning aggregation |
| title_sort | privacy enhanced framework for collaborative big data analysis in healthcare using adaptive federated learning aggregation |
| topic | Federated learning Big Data Machine learning Artificial intelligence Data privacy Transfer learning |
| url | https://doi.org/10.1186/s40537-025-01169-8 |
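
The abstract describes an aggregation step that switches between FedAvg and FedSGD each communication round according to how far the clients diverge. The record does not give the divergence metric, the threshold, or which rule is chosen in which case, so the following Python sketch is only an illustration under stated assumptions: models are flat NumPy vectors, divergence is the mean distance of client gradients from their mean (a hypothetical score), and high divergence selects FedSGD while low divergence selects FedAvg. The names `gradient_divergence`, `adaptive_aggregate`, `threshold`, and `lr` are invented for this example and are not taken from the article.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """FedAvg: average client model weights, weighted by local sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)


def fedsgd(global_weights, client_grads, client_sizes, lr):
    """FedSGD: one server step along the sample-weighted mean client gradient."""
    sizes = np.asarray(client_sizes, dtype=float)
    mean_grad = np.average(np.stack(client_grads), axis=0, weights=sizes)
    return global_weights - lr * mean_grad


def gradient_divergence(client_grads):
    """Hypothetical score: mean distance of client gradients from their
    unweighted mean, relative to the norm of that mean."""
    stacked = np.stack(client_grads)
    mean = stacked.mean(axis=0)
    spread = np.linalg.norm(stacked - mean, axis=1).mean()
    return spread / (np.linalg.norm(mean) + 1e-12)


def adaptive_aggregate(global_weights, client_weights, client_grads,
                       client_sizes, lr=0.1, threshold=1.0):
    """Pick an aggregation rule for this round (assumed policy, not the
    authors'): FedSGD when clients disagree strongly, FedAvg otherwise."""
    score = gradient_divergence(client_grads)
    if score > threshold:
        return "FedSGD", fedsgd(global_weights, client_grads, client_sizes, lr)
    return "FedAvg", fedavg(client_weights, client_sizes)


if __name__ == "__main__":
    global_w = np.array([1.0, 1.0])
    weights = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
    grads = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]  # clients disagree
    rule, new_w = adaptive_aggregate(global_w, weights, grads, [1, 3])
    print(rule, new_w)
```

The assumed policy reflects one common intuition: averaging locally trained weights is cheaper in communication when clients agree, whereas stepping on aggregated gradients limits client drift when they do not. The article itself may use a different criterion.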