MGFNet: An MLP-dominated gated fusion network for semantic segmentation of high-resolution multi-modal remote sensing images
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | International Journal of Applied Earth Observations and Geoinformation |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843224005971 |
| Summary: | The heterogeneity and complexity of multimodal data in high-resolution remote sensing images significantly challenge existing cross-modal networks in fusing the complementary information of high-resolution optical and synthetic aperture radar (SAR) images for precise semantic segmentation. To address this issue, this paper proposes a multi-layer perceptron (MLP) dominated gated fusion network (MGFNet). MGFNet consists of three modules: a multi-path feature extraction network, an MLP-gate fusion (MGF) module, and a decoder. Initially, MGFNet independently extracts features from high-resolution optical and SAR images while preserving spatial information. Then, the well-designed MGF module combines the multi-modal features through channel attention and gated fusion stages, utilizing an MLP as a gate to exploit complementary information and filter redundant data. Additionally, we introduce a novel high-resolution multimodal remote sensing dataset, YESeg-OPT-SAR, with a spatial resolution of 0.5 m. To evaluate MGFNet, we compare it with several state-of-the-art (SOTA) models on the YESeg-OPT-SAR and Pohang datasets, both of which are high-resolution multi-modal datasets. The experimental results demonstrate that MGFNet achieves higher evaluation metrics than the other models, indicating its effectiveness in multi-modal feature fusion for segmentation. The source code and data are available at https://github.com/yeyuanxin110/YESeg-OPT-SAR. |
| ISSN: | 1569-8432 |
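
The summary describes the MGF module as a two-stage fusion: channel attention on each modality, then an MLP acting as a gate that blends optical and SAR features. The record gives no implementation details, so the following NumPy sketch is only an illustration of that general gating pattern; the function names, the two-layer MLP, and the per-channel sigmoid gate are all assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # Squeeze: global average pool per channel, (C, H, W) -> (C,)
    w = feat.mean(axis=(1, 2))
    # Excite: reweight each channel with a sigmoid of its descriptor
    # (a simplified stand-in for the module's channel-attention stage)
    return feat * sigmoid(w)[:, None, None]

def mlp_gate_fusion(f_opt, f_sar, W1, W2):
    """Hypothetical MGF-style fusion: attention, then an MLP-computed gate."""
    f_opt = channel_attention(f_opt)
    f_sar = channel_attention(f_sar)
    # Gate stage: pooled descriptors from both modalities feed a 2-layer MLP
    # that outputs one gate value per channel in (0, 1).
    desc = np.concatenate([f_opt.mean(axis=(1, 2)), f_sar.mean(axis=(1, 2))])
    gate = sigmoid(W2 @ np.tanh(W1 @ desc))  # shape (C,)
    # Complementary blend: the gate trades off optical vs SAR evidence
    # per channel, suppressing redundant information from either branch.
    return gate[:, None, None] * f_opt + (1.0 - gate)[:, None, None] * f_sar

# Toy features: C channels over an H x W spatial grid for each modality.
C, H, W = 8, 4, 4
f_opt = rng.standard_normal((C, H, W))
f_sar = rng.standard_normal((C, H, W))
W1 = rng.standard_normal((16, 2 * C)) * 0.1  # hidden layer weights
W2 = rng.standard_normal((C, 16)) * 0.1      # gate output weights

fused = mlp_gate_fusion(f_opt, f_sar, W1, W2)
print(fused.shape)
```

The fused tensor keeps the input's channel and spatial shape, so a decoder can consume it directly; in the actual MGFNet the attention and gate parameters would of course be learned end to end.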