TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification.
In the field of earth sciences and remote exploration, the classification and identification of surface materials on Earth have been a significant and challenging research area in recent years. Although deep learning has achieved notable results in remote sensing image...
| Main Authors: | Huiqing Wang, Huajun Wang, Linfen Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0316900 |
| _version_ | 1850279423040815104 |
|---|---|
| author | Huiqing Wang; Huajun Wang; Linfen Wu |
| author_facet | Huiqing Wang; Huajun Wang; Linfen Wu |
| author_sort | Huiqing Wang |
| collection | DOAJ |
| description | In the field of earth sciences and remote exploration, the classification and identification of surface materials on Earth have been a significant and challenging research area in recent years. Although deep learning has achieved notable results in remote sensing image classification, the classification of multi-modal remote sensing data remains challenging. In this paper, we propose a fusion network based on a transformer and a gist convolutional neural network (CNN), termed TGF-Net. To minimize the duplication of information across modalities, TGF-Net incorporates a feature reconstruction module (FRM) that employs matrix factorization and a self-attention mechanism to decompose multimodal features and evaluate their similarity, enabling the extraction of both distinct and common features. Meanwhile, a transformer-based spectral feature extraction module (TSFEM) is designed that accounts for the differing characteristics of remote sensing images and the ordered sequence of hyperspectral image (HSI) channels. To represent the relative positions of spatial targets in synthetic aperture radar (SAR) images, we propose a gist-based spatial feature extraction module (GSFEM). To assess the efficacy and superiority of the proposed TGF-Net, we conducted experiments on two datasets comprising HSI and SAR data. (A hedged sketch of the FRM idea is given after this record.) |
| format | Article |
| id | doaj-art-e284c96f5491449fbb970810e1ae25c0 |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spellingShingle | Huiqing Wang Huajun Wang Linfen Wu TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. PLoS ONE |
| title | TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. |
| title_full | TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. |
| title_fullStr | TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. |
| title_full_unstemmed | TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. |
| title_short | TGF-Net: Transformer and gist CNN fusion network for multi-modal remote sensing image classification. |
| title_sort | tgf net transformer and gist cnn fusion network for multi modal remote sensing image classification |
| url | https://doi.org/10.1371/journal.pone.0316900 |
| work_keys_str_mv | AT huiqingwang tgfnettransformerandgistcnnfusionnetworkformultimodalremotesensingimageclassification AT huajunwang tgfnettransformerandgistcnnfusionnetworkformultimodalremotesensingimageclassification AT linfenwu tgfnettransformerandgistcnnfusionnetworkformultimodalremotesensingimageclassification |
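The record above only summarizes TGF-Net at a high level, so the following is a minimal, hypothetical PyTorch sketch of how an FRM-style block combining matrix factorization with self-attention might separate common from distinct multimodal features. The class name `FeatureReconstructionSketch`, the learned low-rank factorization, the shared projection for both modalities, and the use of attention residuals as the "distinct" parts are all assumptions made for illustration; the paper's actual implementation is not described in this record.

```python
# Hypothetical FRM-style block: low-rank factorization of each modality's
# features followed by cross-modal attention to split "common" (shared)
# from "distinct" (modality-specific) components. Names are illustrative.
import torch
import torch.nn as nn


class FeatureReconstructionSketch(nn.Module):
    def __init__(self, dim: int, rank: int = 16, heads: int = 4):
        super().__init__()
        # Learned low-rank factorization: F ≈ (F U) V with U: dim -> rank, V: rank -> dim
        self.u = nn.Linear(dim, rank, bias=False)
        self.v = nn.Linear(rank, dim, bias=False)
        # Cross-modal attention as a stand-in for the similarity evaluation
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, hsi_feat: torch.Tensor, sar_feat: torch.Tensor):
        # hsi_feat, sar_feat: (batch, tokens, dim) feature sequences per modality
        hsi_lr = self.v(self.u(hsi_feat))  # low-rank reconstruction of HSI features
        sar_lr = self.v(self.u(sar_feat))  # low-rank reconstruction of SAR features
        # HSI queries attend over SAR keys/values; the attended output is treated
        # as the information shared by both modalities.
        common, _ = self.attn(hsi_lr, sar_lr, sar_lr)
        # Residuals are treated as the modality-specific ("distinct") parts.
        distinct_hsi = hsi_feat - common
        distinct_sar = sar_feat - common
        return common, distinct_hsi, distinct_sar


if __name__ == "__main__":
    frm = FeatureReconstructionSketch(dim=64)
    hsi = torch.randn(2, 49, 64)  # e.g. 7x7 patch tokens with 64-dim spectral features
    sar = torch.randn(2, 49, 64)  # matching SAR patch tokens
    common, d_hsi, d_sar = frm(hsi, sar)
    print(common.shape, d_hsi.shape, d_sar.shape)
```

The residual split shown here is only one plausible reading of "extraction of distinct as well as common features"; the published FRM may use a different decomposition.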