SuperCoT-X: Masked Hyperspectral Image Modeling With Diverse Superpixel-Level Contrastive Tokenizer
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11072320/ |
| Summary: | Hyperspectral images (HSI) exhibit complex contextual relationships, including variations in local homogeneous regions and spectral similarities among different classes. Contrastive masked patch embedding prediction specializes in capturing rich, high-level visual context from neighborhoods. However, it is a challenge to balance representation certainty against intraclass diversity. Pursuing neighbor diversity through token-level contrast can disrupt the certainty of intraclass representation. To address this issue, we propose a superpixel-level contrastive tokenizer (SuperCoT) for masked HSI modeling. It performs mask prediction with superpixel-calibrated targets, enhancing representation certainty in homogeneous regions. In addition, to mitigate the contextual semantic loss resulting from excessively consistent representations within clusters in SuperCoT, we introduce an intracluster diversity regularization (SuperCoT-D) into the superpixel-level denoising contrast loss. Furthermore, to reduce the computational burden of aggregating superpixel tokens during each iteration in SuperCoT-D, we suggest an alternative approach, SuperCoT-M, which preserves a prototypical dictionary updated through momentum that refers to superpixel labels and implicitly improves the diversity of intracluster representations. Comprehensive experiments on five HSI datasets demonstrate that our proposed methods achieve favorable results and are competitive with other self-supervised approaches. |
| ISSN: | 1939-1404 2151-1535 |
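
The Summary describes SuperCoT-M as keeping a prototypical dictionary that is updated through momentum and refers to superpixel labels. A minimal sketch of that general idea, in plain Python, might look like the following. It is not the authors' implementation: the function names `superpixel_means` and `momentum_update`, the `momentum` parameter, the exponential-moving-average rule, and the toy data are all assumptions made for illustration, and this record gives no details of the actual update rule or loss.

```python
from collections import defaultdict


def superpixel_means(tokens, labels):
    """Average the token embeddings that share a superpixel label."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(tokens, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def momentum_update(prototypes, tokens, labels, momentum=0.9):
    """Assumed EMA rule: p <- m * p + (1 - m) * mean(tokens in superpixel)."""
    means = superpixel_means(tokens, labels)
    updated = dict(prototypes)
    for lab, mean in means.items():
        if lab in updated:
            updated[lab] = [momentum * p + (1.0 - momentum) * x
                            for p, x in zip(updated[lab], mean)]
        else:
            # First time this superpixel label is seen: start from its mean.
            updated[lab] = mean
    return updated


# Toy data (hypothetical): three 2-D token embeddings in two superpixels.
tokens = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]

prototypes = momentum_update({}, tokens, labels)
prototypes = momentum_update(prototypes, [[4.0, 0.0]], [0], momentum=0.5)
```

With these toy inputs, the prototype for superpixel 0 moves from `[2.0, 0.0]` to `[3.0, 0.0]` after the second call, while the prototype for superpixel 1 stays at `[0.0, 2.0]` because no token with that label appears in the second batch. Keeping one running prototype per label avoids re-aggregating every superpixel's tokens on each iteration, which is the computational saving the Summary attributes to SuperCoT-M.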