Enhanced Cloud Detection Using a Unified Multimodal Data Fusion Approach in Remote Images
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Sensors |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1424-8220/25/9/2684 |
| Summary: | To address the complex network architecture design and low computational efficiency caused by variations in the number of modalities in multimodal cloud detection tasks, this paper proposes M2Cloud, an efficient and unified multimodal cloud detection model that can process data from any number of modalities. The core innovation of M2Cloud is its multimodal data fusion method, which requires no architectural changes when new modalities are added, reducing incremental computing costs and improving overall efficiency. The fusion module also generalizes well and can be integrated into other network architectures in a plug-and-play manner, which makes it practical and flexible. To address the challenge of unified multimodal feature extraction, we adopt two key strategies: (1) constructing feature extraction modules with shared but independent weights for each modality to preserve the inherent features of each modality; (2) using cosine similarity to adaptively learn complementary features between different modalities, thereby reducing redundant information. Experimental results show that M2Cloud matches or surpasses state-of-the-art (SOTA) performance on the public multimodal datasets WHUS2-CD and WHUS2-CD+, verifying its effectiveness in the unified multimodal cloud detection task. The research offers new insights and technical support for multimodal data fusion and cloud detection. |
|---|---|
| ISSN: | 1424-8220 |
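
The summary mentions using cosine similarity to weigh complementary features across an arbitrary number of modalities. The following is a minimal sketch of that general idea, not the M2Cloud fusion module itself: the function names `cosine_similarity` and `fuse_modalities`, the `(C, H, W)` feature layout, and the weighting rule (one minus similarity, normalized with a softmax) are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Per-pixel cosine similarity along the channel axis of (C, H, W) arrays."""
    dot = (a * b).sum(axis=0)
    norm = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    return dot / (norm + eps)


def fuse_modalities(features):
    """Fuse a list of (C, H, W) feature maps, one per modality.

    Hypothetical rule (an assumption, not taken from the paper): a modality
    that is less similar to the mean of the other modalities is treated as
    more complementary and receives a larger per-pixel weight.
    """
    if len(features) == 1:
        return features[0]  # nothing to fuse
    stack = np.stack(features)  # (M, C, H, W)
    total = stack.sum(axis=0)
    scores = []
    for f in features:
        others_mean = (total - f) / (len(features) - 1)
        scores.append(1.0 - cosine_similarity(f, others_mean))
    scores = np.stack(scores)  # (M, H, W)
    # Softmax across the modality axis so weights sum to 1 at each pixel.
    exp = np.exp(scores - scores.max(axis=0, keepdims=True))
    weights = exp / exp.sum(axis=0, keepdims=True)
    # Broadcast weights over channels and sum over modalities.
    return (weights[:, None] * stack).sum(axis=0)
```

Because the function accepts a list of any length, adding a modality changes only the input, not the code, which mirrors the "no architectural changes" property the summary describes. A learned model would replace the fixed weighting rule with trainable parameters.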