DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation

Bibliographic Details
Main Authors: Yanhong Yang, Fei Wang, Haozheng Zhang, Yushan Xue, Guodao Zhang, Shengyong Chen
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11045236/
Description
Summary: Semantic segmentation of remote sensing (RS) images is essential for land cover interpretation in geoscience research. Although existing dual-branch methods enable feature complementarity, information redundancy during feature extraction and fusion hinders the full and effective use of multiscale features, limiting model performance. In this article, we introduce DCANet, a dual-branch cross-scale feature aggregation network built on an encoder–decoder framework that incorporates visual-state-space (VSS) blocks in the encoder branch to overcome the limitations of conventional convolutional neural networks in capturing global contextual information. Specifically, to mitigate the information redundancy caused by cross-scale residual learning, we propose a distributed feature aggregation strategy. In the fusion path, we introduce a single-scale fusion module that effectively aggregates and enhances both local and global features within a single scale. Furthermore, we design a multiscale attention-based decoder block that generates unified key-value representations by integrating features from multiple encoder stages, thereby fully leveraging multiscale features. To further refine feature representations, we propose an adaptive feature refinement module that fuses spatial details with contextual information. Extensive experiments on the ISPRS and LoveDA datasets demonstrate the effectiveness and potential of the proposed DCANet for semantic segmentation of RS images.
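
To make the "unified key-value representations from multiple encoder stages" idea concrete, below is a minimal PyTorch sketch. This is not the authors' code: the module name, channel widths, pooling strategy, and attention configuration are all assumptions chosen only to illustrate how several encoder scales can be projected to one width, pooled to one resolution, and concatenated into a single key-value sequence that a decoder query attends over.

    # Hypothetical sketch of a multiscale key-value attention block.
    # All names and dimensions here are illustrative assumptions,
    # not the DCANet implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleKVAttention(nn.Module):
        def __init__(self, dims, embed_dim=256, num_heads=8, kv_hw=16):
            super().__init__()
            # One 1x1 projection per encoder stage to a shared channel width.
            self.proj = nn.ModuleList(nn.Conv2d(d, embed_dim, 1) for d in dims)
            self.kv_hw = kv_hw  # spatial size key-value tokens are pooled to
            self.attn = nn.MultiheadAttention(embed_dim, num_heads,
                                              batch_first=True)

        def forward(self, query_feat, encoder_feats):
            # query_feat: (B, C, H, W) decoder feature acting as the query.
            # encoder_feats: list of (B, C_i, H_i, W_i) encoder features.
            b, c, h, w = query_feat.shape
            kv = []
            for p, f in zip(self.proj, encoder_feats):
                f = F.adaptive_avg_pool2d(p(f), self.kv_hw)  # unify scale
                kv.append(f.flatten(2).transpose(1, 2))      # (B, hw, C)
            kv = torch.cat(kv, dim=1)            # unified key-value sequence
            q = query_feat.flatten(2).transpose(1, 2)        # (B, HW, C)
            out, _ = self.attn(q, kv, kv)        # cross-scale attention
            return out.transpose(1, 2).reshape(b, c, h, w)

    # Example: fuse three encoder stages into one key-value set.
    feats = [torch.randn(1, d, s, s) for d, s in [(96, 64), (192, 32), (384, 16)]]
    block = MultiScaleKVAttention(dims=[96, 192, 384])
    fused = block(torch.randn(1, 256, 32, 32), feats)
    print(fused.shape)  # torch.Size([1, 256, 32, 32])

Note that the output keeps the query's resolution; only the key-value side is pooled, which is one plausible way a decoder block could attend over all encoder scales at a fixed token budget.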
ISSN: 1939-1404
2151-1535