Multiscale Feature Reconstruction and Interclass Attention Weighting for Land Cover Classification

Bibliographic Details
Main Authors: Zongqian Zhan, Zirou Xiong, Xin Huang, Chun Yang, Yi Liu, Xin Wang
Format: Article
Language:English
Published: IEEE 2024-01-01
Series:IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access:https://ieeexplore.ieee.org/document/10356620/
Description
Summary:Land cover classification aims to assign each pixel of a high-resolution remote sensing image a planimetric category label (such as vegetation, building, or water). In recent years, many serial deep-learning architectures (in which features are passed along a single path, as in ResNet, MobileNet, and Segformer) based on convolutional neural networks and attention mechanisms have been widely explored for land cover classification. However, high-resolution remote sensing images typically exhibit abundant texture details, objects at variable scales, large intraclass variance, and strong interclass correlation, which pose challenges for land cover classification. In this work, we present two pluggable modules that further boost serial learning architectures: first, to cope with ambiguous boundaries caused by lost details and fragmented segmentation stemming from scale variance, a combination of spatial attention and channel attention is proposed for multiscale feature reconstruction (MSFR); second, to mitigate classification errors caused by intraclass variance and interclass correlation, we explore an interclass attention weighting (ICAW) module, which builds a feature vector for each category and applies a multihead attention model to capture self-attention dependencies among the categories. The experimental results demonstrate that the proposed modules can be plugged into existing serial learning architectures and improve overall accuracy (OA) by 5.64% on the ISPRS Vaihingen two-dimensional dataset (using ResNet50 as the backbone); in particular, the OA values are 80.68% and 86.32% before and after using the proposed modules, respectively. In addition, compared with other state-of-the-art models, our method achieves similar or even better classification results while offering superior inference performance.
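The abstract describes the two modules only at a high level. The sketch below is a minimal, hypothetical PyTorch illustration of those two ideas, not the authors' implementation: a channel-plus-spatial attention block standing in for the MSFR concept, and multihead self-attention over per-category feature vectors standing in for ICAW. All module names, tensor shapes, and hyperparameters here are assumptions.

```python
# Hypothetical sketch of the two ideas described in the abstract
# (assumed shapes and names; not the paper's released code).
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Reweight a feature map along channels, then along spatial positions."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatially, excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: squeeze channels, excite per pixel.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)  # (B, C, H, W)
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )  # (B, 2, H, W): average- and max-pooled channel descriptors
        return x * self.spatial_conv(pooled)


class InterclassAttention(nn.Module):
    """Multihead self-attention over one feature vector per category."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, class_feats: torch.Tensor) -> torch.Tensor:
        # class_feats: (B, num_classes, dim), e.g. decoder features pooled
        # under each category's coarse prediction mask (an assumption here).
        refined, _ = self.attn(class_feats, class_feats, class_feats)
        return refined + class_feats
```

In this reading, the first block would be applied to multiscale decoder features to recover boundary detail, and the second would let the per-category vectors attend to one another so that easily confused classes are reweighted before the final pixel-wise prediction.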
ISSN:1939-1404
2151-1535