PolSAR-MPIformer: A Vision Transformer Based on Mixed Patch Interaction for Dual-Frequency PolSAR Image Adaptive Fusion Classification

Bibliographic Details
Main Authors: Xinyue Xin, Ming Li, Yan Wu, Xiang Li, Peng Zhang, Dazhi Xu
Format: Article
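Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Dual-frequency adaptive fusion; mixed patch interaction; PolSAR image classification; vision transformer
Online Access: https://ieeexplore.ieee.org/document/10496188/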
author Xinyue Xin
Ming Li
Yan Wu
Xiang Li
Peng Zhang
Dazhi Xu
collection DOAJ
description Vision transformer (ViT) provides new ideas for polarimetric synthetic aperture radar (PolSAR) image classification due to its advantages in learning global-spatial information. However, the lack of local-spatial information within samples and correlation information among samples, as well as the complexity of the network structure, limit the application of ViT in practice. In addition, dual-frequency PolSAR data provide rich information, but there are fewer related studies than for single-frequency classification algorithms. In this article, we adopt ViT as the basic framework and propose a novel model based on mixed patch interaction for dual-frequency PolSAR image adaptive fusion classification (PolSAR-MPIformer). First, a mixed patch interaction (MPI) module is designed for feature extraction, which replaces the high-complexity self-attention in ViT with intra- and intersample patch interaction. Beyond the global-spatial information that ViT learns within samples, the MPI module adds the learning of local-spatial information within samples and correlation information among samples, thereby obtaining more discriminative features through a low-complexity network. Subsequently, a dual-frequency adaptive fusion (DAF) module is constructed as the classifier of PolSAR-MPIformer. On the one hand, the attention mechanism is utilized in DAF to reduce the impact of speckle noise while preserving details. On the other hand, the DAF evaluates the classification confidence of each band and assigns different weights accordingly, which exploits the complementarity between the dual-frequency data and improves classification accuracy. Experiments on four real dual-frequency PolSAR datasets substantiate the superiority of the proposed PolSAR-MPIformer over other state-of-the-art algorithms.
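The idea of weighting each band by its classification confidence can be illustrated with a minimal sketch. This is not the DAF module from the article, whose attention mechanism and weighting rule are not given in this record. It assumes, purely for illustration, that confidence is the largest softmax probability of a band and that the weights are the normalized confidences; the names softmax, confidence, and fuse_two_bands are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(probs):
    """Assumed confidence measure: the largest class probability."""
    return max(probs)


def fuse_two_bands(logits_a, logits_b):
    """Fuse per-band class scores with confidence-proportional weights.

    Returns the fused probabilities, the pair of band weights, and the
    index of the predicted class.
    """
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    conf_a = confidence(probs_a)
    conf_b = confidence(probs_b)
    # Normalize the two confidences so the weights sum to one.
    w_a = conf_a / (conf_a + conf_b)
    w_b = 1.0 - w_a
    fused = [w_a * pa + w_b * pb for pa, pb in zip(probs_a, probs_b)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, (w_a, w_b), label


if __name__ == "__main__":
    # Band A is confident about class 0; band B is nearly undecided.
    fused, weights, label = fuse_two_bands([4.0, 0.0, 0.0], [0.2, 0.3, 0.1])
    print(weights, label)
```

In this toy case the confident band receives the larger weight, so the fused decision follows it, while an undecided band contributes less. A learned weighting, as the abstract suggests, would replace the fixed confidence rule used here.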
format Article
id doaj-art-9b0a96f965af4e25b0d323b7b62ccfdf
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-9b0a96f965af4e25b0d323b7b62ccfdf 2025-08-20T03:47:03Z
volume 17
pages 8527-8542
doi 10.1109/JSTARS.2024.3386854
ieee_document 10496188
author_details Xinyue Xin (https://orcid.org/0009-0002-1245-8871), National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China
Ming Li (https://orcid.org/0000-0002-4706-5173), National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China
Yan Wu (https://orcid.org/0000-0001-7502-2341), Remote Sensing Image Processing and Fusion Group, School of Electronics Engineering, Xidian University, Xi'an, China
Xiang Li, Beijing Institute of Radio Measurement, Beijing, China
Peng Zhang (https://orcid.org/0000-0002-8065-0948), National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China
Dazhi Xu (https://orcid.org/0000-0001-5942-8878), National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China
title PolSAR-MPIformer: A Vision Transformer Based on Mixed Patch Interaction for Dual-Frequency PolSAR Image Adaptive Fusion Classification
topic Dual-frequency adaptive fusion
mixed patch interaction
PolSAR image classification
vision transformer
url https://ieeexplore.ieee.org/document/10496188/