SAGNet: Small-Target Attention Guided Network for Urban Remote Sensing Image Real-Time Segmentation

Bibliographic Details
Main Authors: Shasha Ren, Xiaodong Zhang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11062335/
Description
Summary: High-precision real-time segmentation of urban remote sensing (URS) images is crucial for emergency response tasks such as urban flood warning and fire spread monitoring. Existing segmentation methods based on feature fusion and attention mechanisms effectively enhance the features of large targets in URS images but introduce noise, which blurs small-target features in the feature space. In this paper, we propose a small-target attention-guided real-time segmentation network (SAGNet) for URS images that optimizes spatial features and highlights small targets. Specifically, a feature-guided enhancement (FGE) model is built from two nonlinear learning operators to learn more discriminative features and filter background noise. In addition, a scale-aware weighted (SAW) loss function adaptively adjusts the loss of objects at different scales through dynamic weighting, highlighting the features of small-scale objects. After quantization and compression, SAGNet has 11.8 M parameters. Experimental results on three URS datasets show that our network outperforms state-of-the-art methods by 6.40% mIoU. Its segmentation speed reaches 64.3 FPS when processing $512\times 512$ images on a laptop device (NVIDIA GeForce RTX 2080 Ti GPU), which meets the demand for real-time image and video processing on mobile devices.
ISSN: 2169-3536
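
The summary describes the SAW loss only as dynamic weighting that raises the contribution of small-scale objects; it does not give the formula. The sketch below is therefore an illustration of the general idea, not the authors' method. It assumes that a class's pixel count in the label map is a usable proxy for object scale and that weights grow with the inverse square root of that count. The function names `scale_aware_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math

def scale_aware_weights(labels, num_classes):
    """Per-class weights from one label map (assumed scheme, not from the paper).

    Classes that cover fewer pixels get larger weights; classes absent from
    the map get weight 0. Weights of present classes average to 1.
    """
    counts = [0] * num_classes
    for row in labels:
        for c in row:
            counts[c] += 1
    total = sum(counts)
    # Inverse square-root frequency: an assumption chosen for illustration.
    raw = [math.sqrt(total / n) if n > 0 else 0.0 for n in counts]
    present = sum(1 for n in counts if n > 0)
    mean = sum(raw) / present
    return [w / mean for w in raw]

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-pixel cross-entropy.

    probs[i][j] is a list of per-class probabilities at pixel (i, j);
    labels[i][j] is the ground-truth class index at that pixel.
    """
    num = 0.0
    den = 0.0
    for prob_row, label_row in zip(probs, labels):
        for p, y in zip(prob_row, label_row):
            num += weights[y] * -math.log(max(p[y], 1e-12))
            den += weights[y]
    return num / den

# A 3x3 label map where class 1 is a single-pixel "small target".
labels = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
weights = scale_aware_weights(labels, num_classes=2)
probs = [[[0.5, 0.5] for _ in range(3)] for _ in range(3)]
loss = weighted_cross_entropy(probs, labels, weights)
```

Because the weights are recomputed from each label map, an image dominated by background shifts more of the loss onto its few small-target pixels, which is the behavior the summary attributes to dynamic weighting.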