Lightweight Multi-Head MambaOut with CosTaylorFormer for Hyperspectral Image Classification

Bibliographic Details
Main Authors: Yi Liu, Yanjun Zhang, Jianhong Zhang
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/11/1864
Description
Summary: Unmanned aerial vehicles (UAVs) equipped with hyperspectral hardware systems are widely used in urban planning and land classification. However, hyperspectral sensors generate large volumes of data that are rich in both spatial and spectral information, which makes efficient processing on resource-constrained devices challenging. While transformers have been widely adopted for hyperspectral image classification due to their global feature extraction capabilities, their quadratic computational complexity limits their applicability on resource-constrained devices. To address this limitation and enable the real-time processing of hyperspectral data on UAVs, we propose a lightweight multi-head MambaOut with a CosTaylorFormer (LMHMambaOut-CosTaylorFormer). First, a 3D-2D CNN extracts shallow spatial and spectral features from hyperspectral images. One branch then employs a linear transformer, CosTaylorFormer, to extract global spectral information. Specifically, CosTaylorFormer uses a cosine function to adjust the weights according to the spectral curve distribution, which is more conducive to establishing long-distance spectral dependencies. Compared with other linearized transformers, the proposed CosTaylorFormer yields a larger improvement in model performance. In the other branch, we propose a multi-head MambaOut to extract global spatial features and enhance the classification performance of the network. Moreover, a dynamic information fusion strategy is proposed to adaptively fuse spatial and spectral information. The proposed network is validated on four datasets (IP, WHU-Longkou, SA, and PU) and compared with several models, demonstrating superior classification accuracy with only 0.22 M model parameters, thus achieving a better balance between model complexity and accuracy.
ISSN: 2072-4292
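
Illustrative note: the summary does not give the CosTaylorFormer equations, so the sketch below is not the authors' method. It shows a related, generic idea instead: linear attention with a cosine distance re-weighting (in the style of cosFormer), where the weight cos(pi/2 * (i - j) / n) is split with the identity cos(a - b) = cos a cos b + sin a sin b so that the cost grows linearly with the number of tokens n. The function names, the ReLU feature map, and the tensor shapes are all assumptions made for illustration; a quadratic reference version is included only to check that the two forms agree.

```python
import numpy as np


def cos_linear_attention(q, k, v, eps=1e-6):
    """Cosine-reweighted linear attention (hypothetical sketch).

    q, k: (n, d) token features, v: (n, d_v). Returns (n, d_v).
    Cost is O(n * d * d_v) because keys and values are summarized first.
    """
    n = q.shape[0]
    q = np.maximum(q, 0.0)  # ReLU feature map keeps similarities non-negative
    k = np.maximum(k, 0.0)
    theta = (np.pi / 2) * np.arange(n)[:, None] / n  # per-position angle, (n, 1)
    q_cos, q_sin = q * np.cos(theta), q * np.sin(theta)
    k_cos, k_sin = k * np.cos(theta), k * np.sin(theta)
    # Key/value summaries of shape (d, d_v), computed once for all queries.
    num = q_cos @ (k_cos.T @ v) + q_sin @ (k_sin.T @ v)
    # Row-wise normalizer, also computed without forming an (n, n) matrix.
    den = q_cos @ k_cos.sum(axis=0) + q_sin @ k_sin.sum(axis=0)
    return num / (den[:, None] + eps)


def cos_quadratic_attention(q, k, v, eps=1e-6):
    """Reference O(n^2) version that builds the full weight matrix."""
    n = q.shape[0]
    q = np.maximum(q, 0.0)
    k = np.maximum(k, 0.0)
    idx = np.arange(n)
    # Weight decays as the distance between positions i and j grows.
    decay = np.cos((np.pi / 2) * (idx[:, None] - idx[None, :]) / n)
    w = (q @ k.T) * decay
    return (w @ v) / (w.sum(axis=1, keepdims=True) + eps)


rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))
k = rng.normal(size=(16, 8))
v = rng.normal(size=(16, 4))
out_linear = cos_linear_attention(q, k, v)
out_quadratic = cos_quadratic_attention(q, k, v)
```

In this sketch the 16 rows stand in for a sequence of spectral tokens. The linear form never materializes the 16 x 16 weight matrix, which is the property that makes linearized attention attractive on resource-constrained hardware.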