UAV-FAENet: Frequency-Aware and Attention-Enhanced Network for Remote Sensing Semantic Segmentation of UAV Imagery
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Online Access: | https://ieeexplore.ieee.org/document/11098848/ |
| Summary: | In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology, low-altitude remote sensing imagery has been increasingly applied in urban planning, environmental monitoring, and disaster management. However, due to large variations in object scales and complex boundaries of ground objects, existing methods still face significant challenges in handling small targets and objects with intricate edges. To address these issues, this article proposes a lightweight and efficient semantic segmentation network for UAV remote sensing images, named UAV-FAENet, aiming to improve the detection accuracy of small targets and the quality of edge segmentation. UAV-FAENet adopts an encoder-decoder architecture, integrating a high-low frequency adaptive enhancement module (HLAEM) and a decoder based on a state space modeling mechanism (CSRA-VSS decoder). Specifically, HLAEM decomposes features into low-frequency and high-frequency components in the frequency domain, enhances them separately, and dynamically fuses them through a channel attention mechanism to effectively preserve edge details. The CSRA-VSS decoder introduces a state space model to capture long-range dependencies in feature sequences and employs a cross-scale attention mechanism to enhance feature responses in small target regions. The proposed method achieves mIoU scores of 68.84%, 80.11%, and 70.26% on the UAVid, UDD6, and DroneDeploy UAV datasets, respectively. Our method outperforms existing mainstream approaches and demonstrates strong potential in improving small object recognition accuracy and edge detail recovery, providing a feasible solution for high-precision semantic segmentation in complex urban scenes. |
| ISSN: | 1939-1404, 2151-1535 |
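The HLAEM step summarized above (decompose features into low- and high-frequency components in the frequency domain, enhance each, then fuse them with channel attention) can be sketched in NumPy. This is a minimal illustration of the general idea only, not the authors' implementation: the FFT-based circular low-pass split, the `radius_frac` threshold, and the parameter-free sigmoid gating standing in for learned channel attention are all assumptions.

```python
import numpy as np

def hlaem_sketch(feat: np.ndarray, radius_frac: float = 0.25) -> np.ndarray:
    """Illustrative high/low frequency split with channel-wise gated fusion.

    feat: feature map of shape (C, H, W).
    radius_frac: fraction of the spectrum treated as "low frequency"
                 (an assumed hyperparameter, not from the paper).
    """
    C, H, W = feat.shape
    # Per-channel 2-D FFT; shift the DC component to the spectrum centre.
    spec = np.fft.fftshift(np.fft.fft2(feat, axes=(-2, -1)), axes=(-2, -1))

    # Circular low-pass mask around the centre of the spectrum.
    yy, xx = np.mgrid[:H, :W]
    r = np.hypot(yy - H / 2.0, xx - W / 2.0)
    low_mask = (r <= radius_frac * min(H, W)).astype(feat.dtype)

    # Low-frequency branch via inverse FFT; the residual carries
    # high-frequency content (edges, fine detail).
    low = np.fft.ifft2(
        np.fft.ifftshift(spec * low_mask, axes=(-2, -1)), axes=(-2, -1)
    ).real
    high = feat - low

    def gate(x: np.ndarray) -> np.ndarray:
        # Toy channel attention: squeeze by global average pooling,
        # then gate each channel with a sigmoid (learned weights omitted).
        s = x.mean(axis=(-2, -1), keepdims=True)  # (C, 1, 1)
        return x * (1.0 / (1.0 + np.exp(-s)))

    # Dynamic fusion of the separately enhanced branches.
    return gate(low) + gate(high)
```

In the actual module the gating would be produced by learned channel-attention weights and the enhancement of each branch by convolutional layers; the sketch only shows the frequency-domain split-enhance-fuse data flow.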