A Multi-Branch Attention Fusion Method for Semantic Segmentation of Remote Sensing Images

Bibliographic Details
Main Authors: Kaibo Li, Zhenping Qiang, Hong Lin, Xiaorui Wang
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/11/1898
Description
Summary: In recent years, advancements in remote sensing observation technology have significantly enriched the surface feature information captured in remote sensing images, posing greater challenges for semantic information extraction from such imagery. While convolutional neural networks (CNNs) excel at modeling relationships between adjacent image regions, processing multidimensional data requires attention mechanisms. However, due to the inherent complexity of remote sensing images, most attention mechanisms designed for natural images underperform when applied to remote sensing data. To address these challenges in remote sensing image semantic segmentation, we propose a highly generalizable multi-branch attention fusion method based on shallow and deep features. This approach applies pixel-wise, spatial, and channel attention mechanisms to feature maps fused from shallow and deep features, thereby enhancing the network’s semantic information extraction capability. Through evaluations on the Cityscapes, LoveDA, and WHDLD datasets, we validate the performance of our method on remote sensing data. The results demonstrate consistent improvements in segmentation accuracy across most categories, highlighting strong generalization capability. Specifically, compared to baseline methods, our approach achieves average mIoU improvements of 0.42% and 0.54% on the WHDLD and LoveDA datasets, respectively, significantly enhancing network performance in complex remote sensing scenarios.
ISSN: 2072-4292
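
The abstract describes gating a fused shallow/deep feature map with three attention branches (pixel-wise, spatial, and channel). The record does not give the paper's exact branch designs or fusion operator, so the following is only a minimal NumPy sketch of that general idea, assuming sigmoid gating, element-wise fusion, and simple averaging of the branches:

```python
# Hypothetical sketch of multi-branch attention fusion; branch definitions
# and the fusion operator are assumptions, not the paper's implementation.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f):
    # f: (C, H, W). Global-average-pool each channel, gate with sigmoid.
    w = sigmoid(f.mean(axis=(1, 2)))          # (C,)
    return f * w[:, None, None]

def spatial_attention(f):
    # Average across channels to obtain one weight per spatial location.
    w = sigmoid(f.mean(axis=0))               # (H, W)
    return f * w[None, :, :]

def pixel_attention(f):
    # Element-wise gate on every value of the feature map.
    return f * sigmoid(f)

def multi_branch_fusion(shallow, deep):
    # Fuse shallow and deep features (element-wise sum here; the paper may
    # use a learned fusion), then average the three attention branches.
    fused = shallow + deep
    branches = (channel_attention(fused)
                + spatial_attention(fused)
                + pixel_attention(fused))
    return branches / 3.0

rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))      # e.g. early-layer features
deep = rng.standard_normal((4, 8, 8))         # e.g. upsampled deep features
out = multi_branch_fusion(shallow, deep)
print(out.shape)                              # (4, 8, 8)
```

The output keeps the input shape, so such a module could in principle be dropped between a backbone and a segmentation head without changing tensor dimensions.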