Reparameterized Feature Aggregation Convolutional Neural Network for Remote Sensing Scene Image Classification

Bibliographic Details
Main Authors: Cuiping Shi, Mengxiang Ding, Liguo Wang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10999066/
Description
Summary: With the advancement of deep learning techniques, transformers have been introduced into remote sensing scene classification (RSSC). Although transformers are good at modeling long-range dependencies, the computational complexity of the self-attention mechanism is proportional to the square of the input sequence length, which leads to high computational cost and resource consumption when processing large-scale or high-resolution remote sensing images. This study proposes a reparameterized feature aggregation convolutional neural network (RepFACNN), a network architecture that combines the advantages of convolutional neural networks (CNNs) and transformers and reduces computational complexity by replacing the self-attention module with a reparameterized transformer (RepFormer). First, a RepFormer is constructed to extract multilevel features. Then, a multihead hybrid convolution module is designed to extract spatial features at multiple scales, so that the model can perceive fine details and broader context simultaneously. Finally, a feature fusion module combines the features from the two branches for more accurate and robust classification. To evaluate RepFACNN, experiments were conducted on three commonly used RSSC datasets: UC-Merced, AID, and NWPU. The results show that RepFACNN outperforms several state-of-the-art scene classification approaches by a large margin.
ISSN: 1939-1404, 2151-1535
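
Note on the quadratic cost mentioned in the summary: the short sketch below is only an illustration of why self-attention becomes expensive on high-resolution images. It is not taken from the article. It assumes a ViT-style tokenization with non-overlapping square patches; the patch size of 16 and the image sizes are hypothetical values chosen for the example, and the function name is made up here.

```python
def attention_matrix_entries(image_size: int, patch_size: int = 16) -> int:
    """Count pairwise attention scores (per head, per layer) for a square image.

    Assumption (not from the article): the image is cut into non-overlapping
    patch_size x patch_size patches, and each patch becomes one token.
    """
    tokens_per_side = image_size // patch_size
    num_tokens = tokens_per_side ** 2
    # Self-attention scores every token against every token: N * N entries.
    return num_tokens ** 2


# Illustrative image side lengths in pixels (hypothetical values).
for size in (256, 512, 1024):
    print(size, attention_matrix_entries(size))
```

Under these assumptions, doubling the image side length quadruples the number of tokens and multiplies the number of attention scores by 16, which is the growth that the summary cites as the motivation for replacing the self-attention module.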