A dynamic attention mechanism for road extraction from high-resolution remote sensing imagery using feature fusion

Abstract Accurate road information is critical for intelligent navigation and urban planning. Compared with traditional road detection methods, deep learning-based approaches have demonstrated significant advantages in road extraction from remote sensing imagery. However, challenges such as occlusio...

Bibliographic Details
Main Authors: Haoming Bai, Chao Ren, Zhenzhong Huang, Yao Gu
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
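Subjects: Dual-branch fusion encoder; Hybrid feature dilation-aware decoder; Dynamic attention mechanism; Road segmentation; Remote sensing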
Online Access: https://doi.org/10.1038/s41598-025-02267-6
author Haoming Bai
Chao Ren
Zhenzhong Huang
Yao Gu
collection DOAJ
description Abstract Accurate road information is critical for intelligent navigation and urban planning. Compared with traditional road detection methods, deep learning-based approaches have demonstrated significant advantages in road extraction from remote sensing imagery. However, occlusion by vegetation and buildings, together with the similarity between roads and surrounding objects, often leads to incomplete road extraction. To address these issues, we propose a novel deep learning model, RISENet, which consists of three main components: a dual-branch fusion encoder, a multi-layer dynamic spatial channel fusion attention mechanism (MCSA), and a hybrid feature dilation-aware decoder. The dual-branch encoder uses dual convolutions and multi-head deep convolutions to extract fundamental features and capture fine-grained details. The feature fusion module integrates global and local information, strengthening the model’s feature representation. The MCSA captures long-range dependencies within remote sensing images, improving the differentiation between roads and other objects. The dilation-aware decoder dynamically expands the receptive field, preserving global features while reducing the loss of fine details. RISENet was evaluated on three distinct road segmentation benchmarks, achieving accuracies of 90.04%, 92.24%, and 88.18%, respectively. The method performs well in terms of both visual quality and quantitative indicators. Ablation experiments also confirm the effectiveness of the adopted loss function and fusion strategy. These results indicate that RISENet performs well on road segmentation across the evaluated datasets and is robust.
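The abstract states that the dilation-aware decoder expands the receptive field. As a rough illustration of why dilation does this, the sketch below computes the receptive field of a stack of stride-1 dilated convolutions. It is a generic calculation, not RISENet's decoder: the function names, the 3x3 kernel size, and the dilation rates are assumptions chosen for the example and do not come from the record.

```python
from typing import Sequence


def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 convolutions along one axis.

    Each layer widens the field by (effective kernel size - 1).
    """
    rf = 1
    for d in dilations:
        rf += effective_kernel(kernel_size, d) - 1
    return rf


if __name__ == "__main__":
    # Hypothetical dilation schedules, for illustration only.
    for rates in ([1, 1, 1], [1, 2, 4], [1, 2, 4, 8]):
        print(rates, receptive_field(3, rates))
```

With the same number of 3x3 layers, increasing the dilation rate widens the receptive field without adding parameters, which is the general motivation for dilated decoders in segmentation models.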
format Article
id doaj-art-c175efe7265c44e3b756992f4f7e26dd
institution OA Journals
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-c175efe7265c44e3b756992f4f7e26dd
2025-08-20T02:34:14Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-05-01
15
1
1
22
10.1038/s41598-025-02267-6
Haoming Bai (0), College of Geomatics and Geoinformation, Guilin University of Technology
Chao Ren (1), College of Geomatics and Geoinformation, Guilin University of Technology
Zhenzhong Huang (2), College of Geomatics and Geoinformation, Guilin University of Technology
Yao Gu (3), College of Geomatics and Geoinformation, Guilin University of Technology
title A dynamic attention mechanism for road extraction from high-resolution remote sensing imagery using feature fusion
topic Dual-branch fusion encoder
Hybrid feature dilation-aware decoder
Dynamic attention mechanism
Road segmentation
Remote sensing
url https://doi.org/10.1038/s41598-025-02267-6