GLFFNet: Global–Local Feature Fusion Network for High-Resolution Remote Sensing Image Semantic Segmentation

Bibliographic Details
Main Authors:	Saifeng Zhu, Liaoying Zhao, Qingjiang Xiao, Jigang Ding, Xiaorun Li
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Remote Sensing
Subjects:	high-resolution remote sensing images; semantic segmentation; detailed information; global–local feature fusion; VMamba
Online Access:	https://www.mdpi.com/2072-4292/17/6/1019

Description
Summary:	Although hybrid models based on convolutional neural network (CNN) and Transformer can extract features encompassing both global and local information, they still face two challenges in addressing the semantic segmentation task of high-resolution remote sensing (HR²S) images. First, they are limited by the loss of detailed information during encoding, resulting in inadequate utilization of features. Second, the ineffective fusion of local and global context information leads to unsatisfactory segmentation performance. To simultaneously address these two challenges, we propose a dual-branch network named global–local feature fusion network (GLFFNet) for HR²S image semantic segmentation. Specifically, we use the residual network (ResNet) as the main branch to extract local features. Recently, a Mamba architecture based on State Space Models has shown significant potential in image semantic segmentation tasks. Given that Mamba is capable of handling long-range relationships with linear computational complexity and relatively high speed, we introduce VMamba as an auxiliary branch encoder to provide global information for the main branch. Meanwhile, in order to utilize global information efficiently, we propose a multi-scale feature refinement (MSFR) module to reduce the loss of details during global feature extraction. Additionally, we develop a semantic bridging fusion (SBF) module to promote the full integration of global and local features, resulting in more comprehensive and refined feature representations. Comparative experiments on three public datasets demonstrate the segmentation accuracy and application potential of GLFFNet. Specifically, GLFFNet achieves mIoU scores of 84.01% on ISPRS Vaihingen, 87.54% on ISPRS Potsdam, and 54.73% on LoveDA, as well as mF1 scores of 91.11%, 93.23%, and 70.07% on these respective datasets.
ISSN:	2072-4292
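
Note on the reported metrics: the summary quotes mIoU and mF1 scores but this record does not say how they were computed. The sketch below shows the common definition, in which per-class IoU and F1 are derived from a confusion matrix and then averaged over classes. The function names, the toy labels, and the choice to skip classes that appear in neither the reference nor the prediction are assumptions for illustration, not the authors' evaluation code; which classes each benchmark includes in the average is likewise not stated here.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the reference class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mf1(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (mean IoU, mean F1) averaged over the classes present in cm."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # reference c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, reference otherwise
        if tp + fp + fn == 0:
            continue  # assumption: skip classes absent from both reference and prediction
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    if not ious:
        raise ValueError("confusion matrix contains no labelled pixels")
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example with three classes and six flattened "pixels" (hypothetical data).
reference = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
miou, mf1 = miou_and_mf1(confusion_matrix(reference, predicted, num_classes=3))
print(f"mIoU = {miou:.4f}, mF1 = {mf1:.4f}")
```

Because per-class F1 equals 2·IoU / (1 + IoU), it is never lower than the IoU for the same class, which is consistent with the mF1 figures in the summary being higher than the corresponding mIoU figures.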

Similar Items