DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation


Bibliographic Details
Main Authors: Yanhong Yang, Fei Wang, Haozheng Zhang, Yushan Xue, Guodao Zhang, Shengyong Chen
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Multiscale fusion; remote sensing (RS) image; semantic segmentation; visual-state-space (VSS) model
Online Access: https://ieeexplore.ieee.org/document/11045236/
Collection: DOAJ (institution: OA Journals)
Record ID: doaj-art-59b6864d3443444f84d3ec2d581406e8
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2025.3581585 (IEEE article 11045236)
Volume/Pages: vol. 18 (2025), pp. 15958-15971

Description: Semantic segmentation of remote sensing (RS) images is essential for land cover interpretation in geoscience research. Although existing dual-branch-based methods enable feature complementarity, information redundancy during feature extraction and fusion hinders the full and effective utilization of multiscale features, thus limiting model performance. In this article, we introduce DCANet, a dual-branch cross-scale feature aggregation network based on an encoder–decoder framework, incorporating visual-state-space (VSS) blocks in the encoder branch to overcome the limitations of conventional convolutional neural networks in capturing global contextual information. Specifically, to mitigate the information redundancy caused by cross-scale residual learning, we propose a distributed feature aggregation strategy. In the fusion path, a single-scale fusion module is innovatively introduced, which effectively aggregates and enhances both local and global features within a single scale. Furthermore, we design a multiscale attention-based decoder block to generate unified key-value representations by integrating features from multiple encoder stages, thereby fully leveraging multiscale features. To further refine feature representations, an adaptive feature refinement module is proposed, fusing spatial details with contextual information. Extensive experiments conducted on the ISPRS and LoveDA datasets demonstrate the effectiveness and potential of the proposed DCANet for semantic segmentation tasks in RS images.

Author affiliations:
Yanhong Yang (ORCID 0000-0003-4547-4659), School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Fei Wang (ORCID 0009-0000-5884-3949), School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Haozheng Zhang (ORCID 0009-0004-3009-5570), School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Yushan Xue, Engineering Research Center of Learning-Based Intelligent System (Ministry of Education), the Key Laboratory of Computer Vision and Systems (Ministry of Education), and the School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Guodao Zhang (ORCID 0000-0002-6264-5854), Institute of Intelligent Media Computing, Hangzhou Dianzi University, Hangzhou, China
Shengyong Chen (ORCID 0000-0002-6705-3831), Engineering Research Center of Learning-Based Intelligent System (Ministry of Education), the Key Laboratory of Computer Vision and Systems (Ministry of Education), and the School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
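The abstract mentions a decoder block that builds unified key-value representations from several encoder stages. The paper's actual module definitions are not given in this record, so the following is only an illustrative NumPy sketch of that general idea: features from stages at different resolutions are projected to a common width and concatenated into one key/value set that a single attention step can attend over. All names, shapes, and projection choices here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multiscale_attention(query_feat, encoder_feats, d=32, seed=0):
    """Toy cross-scale attention: one query scale attends over a
    unified key/value set pooled from several encoder stages."""
    rng = np.random.default_rng(seed)
    c = query_feat.shape[-1]
    Wq = rng.standard_normal((c, d)) / np.sqrt(c)
    # Project every encoder stage (flattened H*W tokens) to the same
    # channel width, then concatenate along the token axis so all
    # scales share one key/value set.
    tokens = []
    for f in encoder_feats:  # each f has shape (Ni, Ci)
        Wp = rng.standard_normal((f.shape[-1], d)) / np.sqrt(f.shape[-1])
        tokens.append(f @ Wp)
    kv = np.concatenate(tokens, axis=0)          # (sum Ni, d)
    q = query_feat @ Wq                          # (Nq, d)
    attn = softmax(q @ kv.T / np.sqrt(d))        # (Nq, sum Ni)
    return attn @ kv                             # (Nq, d)

# Toy stages: flattened feature maps at three spatial resolutions.
stages = [np.ones((64, 16)), np.ones((16, 32)), np.ones((4, 64))]
out = multiscale_attention(np.ones((64, 16)), stages)
print(out.shape)  # (64, 32)
```

The point of concatenating before attention, rather than attending to each stage separately, is that one softmax then weighs tokens from all scales against each other, which is one plausible reading of "unified key-value representations".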