FishermaskFormer: Lightweight Remote Sensing Scene Classification With Masked Transformer

Bibliographic Details
Main Authors: Wei Wu, Xianbin Hu, Zhu Li, Xueliang Luo
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11044321/
Description
Summary: Remote sensing scene classification (RSSC) aims to assign accurate semantic labels to remote sensing images by analyzing scene content. Many recent algorithms have significantly improved RSSC classification accuracy. However, these approaches require a large number of parameters and floating-point operations to do so, resulting in high complexity. To address this issue, we propose a novel RSSC algorithm, dubbed FishermaskFormer, which aggressively decimates features in the convolutional backbone via a novel masking operation based on a proposed Fisher discriminant analysis criterion, and then employs a lightweight transformer block to drive the classification loss. The aim is a flexible and effective framework that preserves classification accuracy while significantly reducing complexity. The proposed transformer design employs a new grouping index that assigns multiheaded transformer groups by maximizing the information interactions within each group. Experimental results show that, compared with leading lightweight RSSC methods, the proposed framework achieves higher classification accuracy at similarly low complexity.
ISSN: 1939-1404, 2151-1535
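
Illustrative sketch: The summary mentions a masking operation driven by a Fisher discriminant analysis criterion, but this record does not give its exact form. The following is a minimal sketch of the general idea only, assuming the classical Fisher score (between-class scatter divided by within-class scatter) computed per feature channel, with the lowest-scoring channels masked out. The function names `fisher_scores`, `keep_mask`, and `apply_mask`, the `keep_ratio` parameter, and the use of one pooled value per channel are all hypothetical and are not taken from the article.

```python
from statistics import fmean


def fisher_scores(features, labels):
    """Return one classical Fisher score per channel.

    features: list of samples, each a list with one value per channel
              (assumed here to be pooled backbone activations).
    labels:   list of class labels, one per sample.
    """
    n_channels = len(features[0])
    classes = sorted(set(labels))
    scores = []
    for c in range(n_channels):
        column = [sample[c] for sample in features]
        overall_mean = fmean(column)
        between = 0.0  # between-class scatter for this channel
        within = 0.0   # within-class scatter for this channel
        for k in classes:
            values = [v for v, y in zip(column, labels) if y == k]
            class_mean = fmean(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        # Small constant avoids division by zero for constant channels.
        scores.append(between / (within + 1e-12))
    return scores


def keep_mask(scores, keep_ratio):
    """Boolean mask keeping the top `keep_ratio` fraction of channels."""
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(scores))]


def apply_mask(sample, mask):
    """Drop the masked-out channels from one sample."""
    return [v for v, keep in zip(sample, mask) if keep]


if __name__ == "__main__":
    # Toy data: channel 0 separates the two classes, channel 1 does not.
    feats = [[0.0, 1.0], [0.1, 5.0], [1.0, 2.0], [1.1, 4.0]]
    labs = [0, 0, 1, 1]
    mask = keep_mask(fisher_scores(feats, labs), keep_ratio=0.5)
    print(mask)
    print([apply_mask(s, mask) for s in feats])
```

In this toy setting the channel whose class means differ is retained and the uninformative one is dropped, which is the intuition behind discriminant-based feature decimation; the published method may score and mask features quite differently.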