UAV-FAENet: Frequency-Aware and Attention-Enhanced Network for Remote Sensing Semantic Segmentation of UAV Imagery

In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology, low-altitude remote sensing imagery has been increasingly applied in urban planning, environmental monitoring, and disaster management. However, due to large variations in object scales and complex boundaries of...

Bibliographic Details
Main Authors: Dongbo Zhou, Huan Wang, Shiyan Pang, Di Chen, Huang Yao, Jie Yu
Format: Article
Language: English
Published: IEEE 2025-01-01
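Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Attention mechanism; high-low frequency enhancement; remote sensing; semantic segmentation; unmanned aerial vehicle (UAV) images
Online Access: https://ieeexplore.ieee.org/document/11098848/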
author Dongbo Zhou
Huan Wang
Shiyan Pang
Di Chen
Huang Yao
Jie Yu
author_sort Dongbo Zhou
collection DOAJ
description In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology, low-altitude remote sensing imagery has been increasingly applied in urban planning, environmental monitoring, and disaster management. However, due to large variations in object scales and complex boundaries of ground objects, existing methods still face significant challenges in handling small targets and objects with intricate edges. To address these issues, this article proposes a lightweight and efficient semantic segmentation network for UAV remote sensing images, named UAV-FAENet, aiming to improve the detection accuracy of small targets and the quality of edge segmentation. UAV-FAENet adopts an encoder-decoder architecture, integrating a high-low frequency adaptive enhancement module (HLAEM) and a decoder based on a state space modeling mechanism (CSRA-VSS decoder). Specifically, HLAEM decomposes features into low-frequency and high-frequency components in the frequency domain, enhances them separately, and dynamically fuses them through a channel attention mechanism to effectively preserve edge details. The CSRA-VSS decoder introduces a state space model to capture long-range dependencies in feature sequences and employs a cross-scale attention mechanism to enhance feature responses in small target regions. The proposed method achieves mIoU scores of 68.84%, 80.11%, and 70.26% on the UAVid, UDD6, and DroneDeploy UAV datasets, respectively. Our method outperforms existing mainstream approaches and demonstrates strong potential in improving small object recognition accuracy and edge detail recovery, providing a feasible solution for high-precision semantic segmentation in complex urban scenes.
format Article
id doaj-art-aef66cbf84984d24a25977188dce25cd
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-aef66cbf84984d24a25977188dce25cd
2025-08-20T03:43:52Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ISSN 1939-1404, 2151-1535
2025-01-01
Vol. 18, pp. 19853-19868
DOI 10.1109/JSTARS.2025.3593557
IEEE document 11098848
UAV-FAENet: Frequency-Aware and Attention-Enhanced Network for Remote Sensing Semantic Segmentation of UAV Imagery
Dongbo Zhou, https://orcid.org/0000-0001-8682-6281, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Huan Wang, https://orcid.org/0009-0001-9779-6433, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Shiyan Pang, https://orcid.org/0000-0002-1713-3972, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Di Chen, https://orcid.org/0009-0000-3863-7566, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Huang Yao, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Jie Yu, https://orcid.org/0000-0002-8519-1760, State Key Laboratory of Information Engineering in Surveying, Wuhan University, Wuhan, China
https://ieeexplore.ieee.org/document/11098848/
Attention mechanism
high-low frequency enhancement
remote sensing
semantic segmentation
unmanned aerial vehicle (UAV) images
title UAV-FAENet: Frequency-Aware and Attention-Enhanced Network for Remote Sensing Semantic Segmentation of UAV Imagery
topic Attention mechanism
high-low frequency enhancement
remote sensing
semantic segmentation
unmanned aerial vehicle (UAV) images
url https://ieeexplore.ieee.org/document/11098848/
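
The abstract describes HLAEM as splitting features into low-frequency and high-frequency components in the frequency domain, enhancing each separately, and fusing them through channel attention. The record gives no implementation details, so the following is only a minimal NumPy sketch of that general idea and not the authors' module. The function names, the radial cutoff, the fixed gains w_low and w_high, and the softmax over pooled magnitudes are all assumptions made for illustration; the published network presumably learns these quantities.

```python
import numpy as np


def split_frequencies(feat, cutoff=0.25):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    `cutoff` is a hypothetical radius in cycles/sample; it is not from the paper.
    """
    _, h, w = feat.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies in [-0.5, 0.5)
    # Radial low-pass mask: 1 inside the cutoff radius, 0 outside.
    low_mask = (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(feat.dtype)
    spectrum = np.fft.fft2(feat, axes=(-2, -1))
    low = np.fft.ifft2(spectrum * low_mask, axes=(-2, -1)).real
    high = feat - low  # the residual carries edges and fine detail
    return low, high


def channel_attention_fuse(low, high, w_low=1.0, w_high=1.5):
    """Scale each branch, then fuse with per-channel softmax weights.

    The gains stand in for learned enhancement layers (an assumption).
    """
    low_e = w_low * low
    high_e = w_high * high
    # Global average pooling of magnitudes gives one score per channel and branch.
    scores = np.stack([
        np.abs(low_e).mean(axis=(1, 2)),
        np.abs(high_e).mean(axis=(1, 2)),
    ])  # shape (2, C)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    fused = (weights[0][:, None, None] * low_e
             + weights[1][:, None, None] * high_e)
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 16, 16))  # toy stand-in for encoder features
    low, high = split_frequencies(features)
    fused, weights = channel_attention_fuse(low, high)
    print(fused.shape, weights.shape)
```

Defining the high-frequency branch as the residual of the low-pass output keeps the two parts complementary, so their sum reproduces the input before any enhancement is applied.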