Survey of Multimodal Federated Learning: Exploring Data Integration, Challenges, and Future Directions

The rapidly expanding demand for intelligent wireless applications and the Internet of Things (IoT) requires advanced system designs to handle multimodal data effectively while ensuring user privacy and data security. Traditional machine learning (ML) models rely on centralized architectures, which,...

Bibliographic Details
Main Authors: Mumin Adam, Abdullatif Albaseer, Uthman Baroudi, Mohamed Abdallah
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Open Journal of the Communications Society
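Subjects: Multimodal FL; data fusion; cross-modal; multimodal federated transformer learning; multimodal FL communication intelligent IoT applications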
Online Access: https://ieeexplore.ieee.org/document/10938626/
collection DOAJ
description The rapidly expanding demand for intelligent wireless applications and the Internet of Things (IoT) requires advanced system designs to handle multimodal data effectively while ensuring user privacy and data security. Traditional machine learning (ML) models rely on centralized architectures, which, while powerful, often present significant privacy risks due to the centralization of sensitive data. Federated Learning (FL) is a promising decentralized alternative for addressing these issues. However, FL predominantly handles unimodal data, which limits its applicability in environments where devices collect and process various data types such as text, images, and sensor output. To address this limitation, Multimodal FL (MMFL) integrates multiple data modalities, enabling a richer and more holistic understanding of data. In this survey, we explore the challenges and advancements in MMFL, including data representation, fusion techniques, and cross-modal learning strategies. We present a comprehensive taxonomy of MMFL, outlining critical challenges such as modality imbalance, fusion complexity, and security concerns. Additionally, we highlight the role of transformers in MMFL by leveraging their powerful attention mechanisms to process multimodal data in a federated setting. Finally, we discuss various applications of MMFL, including healthcare, human activity recognition, and emotion recognition, and propose future research directions for improving the scalability and robustness of MMFL systems in real-world scenarios.
id doaj-art-5eb5f669330e472ca5fc881311c6ba3a
institution OA Journals
issn 2644-125X
spelling doaj-art-5eb5f669330e472ca5fc881311c6ba3a
Timestamp: 2025-08-20T02:17:28Z
Language: eng
Publisher: IEEE
Series: IEEE Open Journal of the Communications Society
ISSN: 2644-125X
Published: 2025-01-01
Volume: 6
Pages: 2510-2538
DOI: 10.1109/OJCOMS.2025.3554537
IEEE document number: 10938626
Title: Survey of Multimodal Federated Learning: Exploring Data Integration, Challenges, and Future Directions
Authors and affiliations:
Mumin Adam (https://orcid.org/0009-0002-3492-0318), Department of Computer Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Abdullatif Albaseer (https://orcid.org/0000-0002-6886-6500), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Uthman Baroudi (https://orcid.org/0000-0002-1507-5713), Department of Computer Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Mohamed Abdallah (https://orcid.org/0000-0002-3261-7588), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Online Access: https://ieeexplore.ieee.org/document/10938626/
Keywords: Multimodal FL; data fusion; cross-modal; multimodal federated transformer learning; multimodal FL communication intelligent IoT applications
topic Multimodal FL
data fusion
cross-modal
multimodal federated transformer learning
multimodal FL communication intelligent IoT applications