DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation
Semantic segmentation of remote sensing (RS) images is essential for land cover interpretation in geoscience research. Although existing dual-branch based methods enable feature complementarity, information redundancy during feature extraction and fusion hinders the full and effective utilization of multiscale features, thus limiting model performance.
| Main Authors: | Yanhong Yang, Fei Wang, Haozheng Zhang, Yushan Xue, Guodao Zhang, Shengyong Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | Multiscale fusion; remote sensing (RS) image; semantic segmentation; visual-state-space (VSS) model |
| Online Access: | https://ieeexplore.ieee.org/document/11045236/ |
| _version_ | 1850112432290136064 |
|---|---|
| author | Yanhong Yang; Fei Wang; Haozheng Zhang; Yushan Xue; Guodao Zhang; Shengyong Chen |
| author_facet | Yanhong Yang; Fei Wang; Haozheng Zhang; Yushan Xue; Guodao Zhang; Shengyong Chen |
| author_sort | Yanhong Yang |
| collection | DOAJ |
| description | Semantic segmentation of remote sensing (RS) images is essential for land cover interpretation in geoscience research. Although existing dual-branch based methods enable feature complementarity, information redundancy during feature extraction and fusion hinders the full and effective utilization of multiscale features, thus limiting model performance. In this article, we introduce DCANet, a dual-branch cross-scale feature aggregation network based on an encoder–decoder framework, incorporating visual-state-space (VSS) blocks in the encoder branch to overcome the limitations of conventional convolutional neural networks in capturing global contextual information. Specifically, to mitigate the information redundancy caused by cross-scale residual learning, we propose a distributed feature aggregation strategy. In the fusion path, a single-scale fusion module is innovatively introduced, which effectively aggregates and enhances both local and global features within a single scale. Furthermore, we design a multiscale attention-based decoder block to generate unified key-value representations by integrating features from multiple encoder stages, thereby fully leveraging multiscale features. To further refine feature representations, an adaptive feature refinement module is proposed, fusing spatial details with contextual information. Extensive experiments conducted on the ISPRS and LoveDA datasets demonstrate the effectiveness and potential of the proposed DCANet for semantic segmentation tasks in RS images. |
| format | Article |
| id | doaj-art-59b6864d3443444f84d3ec2d581406e8 |
| institution | OA Journals |
| issn | 1939-1404 2151-1535 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| spelling | doaj-art-59b6864d3443444f84d3ec2d581406e8; 2025-08-20T02:37:23Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN 1939-1404, 2151-1535; 2025-01-01; vol. 18, pp. 15958-15971; DOI 10.1109/JSTARS.2025.3581585; IEEE document 11045236; DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation; Yanhong Yang (ORCID 0000-0003-4547-4659), Fei Wang (ORCID 0009-0000-5884-3949), Haozheng Zhang (ORCID 0009-0004-3009-5570), Yushan Xue, Guodao Zhang (ORCID 0000-0002-6264-5854), Shengyong Chen (ORCID 0000-0002-6705-3831); affiliations: School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China (Y. Yang, F. Wang, H. Zhang); Engineering Research Center of Learning-Based Intelligent System (Ministry of Education), the Key Laboratory of Computer Vision and Systems (Ministry of Education), and the School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China (Y. Xue, S. Chen); Institute of Intelligent Media Computing, Hangzhou Dianzi University, Hangzhou, China (G. Zhang); https://ieeexplore.ieee.org/document/11045236/; keywords: Multiscale fusion; remote sensing (RS) image; semantic segmentation; visual-state-space (VSS) model |
| spellingShingle | Yanhong Yang; Fei Wang; Haozheng Zhang; Yushan Xue; Guodao Zhang; Shengyong Chen; DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Multiscale fusion; remote sensing (RS) image; semantic segmentation; visual-state-space (VSS) model |
| title | DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation |
| title_full | DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation |
| title_fullStr | DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation |
| title_full_unstemmed | DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation |
| title_short | DCANet: A Dual-Branch Cross-Scale Feature Aggregation Network for Remote Sensing Image Semantic Segmentation |
| title_sort | dcanet a dual branch cross scale feature aggregation network for remote sensing image semantic segmentation |
| topic | Multiscale fusion; remote sensing (RS) image; semantic segmentation; visual-state-space (VSS) model |
| url | https://ieeexplore.ieee.org/document/11045236/ |
| work_keys_str_mv | AT yanhongyang dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation AT feiwang dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation AT haozhengzhang dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation AT yushanxue dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation AT guodaozhang dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation AT shengyongchen dcanetadualbranchcrossscalefeatureaggregationnetworkforremotesensingimagesemanticsegmentation |
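The abstract above mentions a "single-scale fusion module" that aggregates and enhances local and global features within one scale. As a generic, hypothetical sketch of that idea only — not the authors' implementation, whose details are in the paper — a gated blend of a local filter response and broadcast global context could look like the following (the function names, the 3x3 box filter standing in for a convolution, and the sigmoid gate are all illustrative assumptions):

```python
import numpy as np

def global_context(x):
    # Global branch stand-in: global average pooling over the spatial
    # grid, broadcast back to the full (H, W, C) feature map.
    return x.mean(axis=(0, 1), keepdims=True) * np.ones_like(x)

def local_context(x):
    # Local branch stand-in: a 3x3 box filter with edge padding, a crude
    # proxy for a small convolution capturing neighbourhood detail.
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    h, w, _ = x.shape
    for dy in range(3):
        for dx in range(3):
            out += p[dy:dy + h, dx:dx + w]
    return out / 9.0

def single_scale_fusion(x):
    # Per-element gate in (0, 1) decides how much global context to mix
    # into the local features at this single scale; the result is a
    # convex combination of the two branches.
    g = global_context(x)
    l = local_context(x)
    gate = 1.0 / (1.0 + np.exp(-(g - l)))  # sigmoid gate
    return gate * g + (1.0 - gate) * l
```

Because the output is a convex combination, every fused value stays within the range spanned by the local and global branches at that position; in the actual module the gate would be learned rather than hand-set.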