UAV-FAENet: Frequency-Aware and Attention-Enhanced Network for Remote Sensing Semantic Segmentation of UAV Imagery

Bibliographic Details
Main Authors: Dongbo Zhou, Huan Wang, Shiyan Pang, Di Chen, Huang Yao, Jie Yu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11098848/
Description
Summary: In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology, low-altitude remote sensing imagery has been increasingly applied in urban planning, environmental monitoring, and disaster management. However, due to large variations in object scales and complex boundaries of ground objects, existing methods still face significant challenges in handling small targets and objects with intricate edges. To address these issues, this article proposes a lightweight and efficient semantic segmentation network for UAV remote sensing images, named UAV-FAENet, aiming to improve the detection accuracy of small targets and the quality of edge segmentation. UAV-FAENet adopts an encoder-decoder architecture, integrating a high-low frequency adaptive enhancement module (HLAEM) and a decoder based on a state space modeling mechanism (CSRA-VSS decoder). Specifically, HLAEM decomposes features into low-frequency and high-frequency components in the frequency domain, enhances them separately, and dynamically fuses them through a channel attention mechanism to effectively preserve edge details. The CSRA-VSS decoder introduces a state space model to capture long-range dependencies in feature sequences and employs a cross-scale attention mechanism to enhance feature responses in small target regions. The proposed method achieves mIoU scores of 68.84%, 80.11%, and 70.26% on the UAVid, UDD6, and DroneDeploy UAV datasets, respectively. Our method outperforms existing mainstream approaches and demonstrates strong potential in improving small object recognition accuracy and edge detail recovery, providing a feasible solution for high-precision semantic segmentation in complex urban scenes.
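The high-low frequency decomposition and attention-based fusion that the abstract attributes to HLAEM can be sketched in a minimal form. The snippet below is an illustration of the general idea only, not the authors' implementation: it splits a 2-D feature map with an FFT low-pass mask and fuses the two branches with a simple softmax gate over per-branch statistics (the cutoff `radius` and the gating scheme are assumptions for demonstration).

```python
import numpy as np

def split_frequencies(feat, radius=4):
    """Split a 2-D feature map into low- and high-frequency parts
    using a circular low-pass mask in the shifted FFT domain
    (illustrative; the paper's HLAEM decomposition may differ)."""
    h, w = feat.shape
    spectrum = np.fft.fftshift(np.fft.fft2(feat))
    yy, xx = np.mgrid[:h, :w]
    mask = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    high = feat - low  # residual carries edges and fine detail
    return low, high

def attention_fuse(low, high):
    """Fuse the two branches with a softmax gate over each branch's
    mean magnitude -- a stand-in for the channel attention mechanism
    described in the abstract."""
    stats = np.array([np.abs(low).mean(), np.abs(high).mean()])
    w_low, w_high = np.exp(stats) / np.exp(stats).sum()
    return w_low * low + w_high * high

# Toy single-channel feature map; a real network would apply this
# per channel on encoder features.
feat = np.random.default_rng(0).normal(size=(32, 32))
low, high = split_frequencies(feat)
fused = attention_fuse(low, high)
print(fused.shape)
```

Because the high-frequency branch is defined as the residual, `low + high` reconstructs the input exactly, so the gate only re-weights detail versus structure rather than discarding information.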
ISSN: 1939-1404
2151-1535