Enhancing Remote Sensing Semantic Segmentation Accuracy and Efficiency Through Transformer and Knowledge Distillation

Bibliographic Details
Main Authors: Kang Zheng, Yu Chen, Jingrong Wang, Zhifei Liu, Shuai Bao, Jiao Zhan, Nan Shen
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Convolutional neural network (CNN); remote sensing; semantic segmentation; transformer
Online Access: https://ieeexplore.ieee.org/document/10839278/
author Kang Zheng
Yu Chen
Jingrong Wang
Zhifei Liu
Shuai Bao
Jiao Zhan
Nan Shen
collection DOAJ
description In semantic segmentation, the shift from convolutional neural networks (CNNs) to transformers is driven by transformers' superior ability to capture global semantic information in remote sensing images. However, most transformer methods suffer from slow inference and a limited ability to capture local features. To address these issues, this study designs a hybrid approach that combines knowledge distillation with a CNN-transformer architecture to improve semantic segmentation of remote sensing images. First, the article proposes the dual-path convolutional transformer network (DP-CTNet), whose dual-path structure leverages the strengths of both CNNs and transformers. It incorporates a feature refinement module to optimize the transformer's feature learning and a feature fusion module to merge CNN and transformer features, which keeps the transformer from learning local features insufficiently. Then, with DP-CTNet as the teacher model, pruning and knowledge distillation are used to create the efficient DP-CTNet (EDP-CTNet), which offers superior segmentation speed and accuracy. Angle knowledge distillation (AKD) is proposed to enhance the transfer of features learned by DP-CTNet during knowledge distillation, improving EDP-CTNet's performance. Experimental results show that DP-CTNet thoroughly combines the respective advantages of CNNs and transformers, preserving local detail features while learning extensive sequential semantic information. EDP-CTNet delivers impressive segmentation speed and, after AKD training, excellent segmentation accuracy. Compared with other models, the two models proposed in the article stand out in both accuracy and result visualization.
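The description names angle knowledge distillation (AKD) but this record does not define it. As a rough illustration of the general idea of matching angles between teacher and student features, the sketch below implements a generic angle-wise relational distillation loss in NumPy. It is not the authors' AKD formulation: the function names, the Huber penalty, and the triplet construction are all assumptions made for illustration.

```python
import numpy as np


def angle_matrix(feats, eps=1e-12):
    """Cosine of the angle at vertex j between samples i and k.

    feats: array of shape (n, d), one embedding per sample.
    Returns an (n, n, n) array indexed as [j, i, k].
    """
    # diff[j, i] = feats[i] - feats[j]
    diff = feats[None, :, :] - feats[:, None, :]
    norm = np.linalg.norm(diff, axis=-1, keepdims=True)
    # Zero-length differences (i == j) stay zero instead of dividing by 0.
    unit = diff / np.maximum(norm, eps)
    return np.einsum("jid,jkd->jik", unit, unit)


def huber(x, delta=1.0):
    """Element-wise Huber penalty (an assumed choice of penalty)."""
    a = np.abs(x)
    return np.where(a <= delta, 0.5 * x ** 2, delta * (a - 0.5 * delta))


def angle_distillation_loss(student, teacher):
    """Mean Huber distance between student and teacher angle structures.

    The two inputs need the same number of samples but may differ in
    feature dimension, since only angles between samples are compared.
    """
    return float(np.mean(huber(angle_matrix(student) - angle_matrix(teacher))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_feats = rng.normal(size=(6, 64))  # hypothetical teacher embeddings
    student_feats = rng.normal(size=(6, 16))  # hypothetical pruned-student embeddings
    print(angle_distillation_loss(student_feats, teacher_feats))
```

Because only angles are compared, this loss is unchanged when the student features are uniformly scaled or shifted, which is one reason angle-based terms are sometimes used when the teacher and student have different feature widths.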
format Article
id doaj-art-2d34a1aeba5f4fbf83793d36e2753fcd
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-2d34a1aeba5f4fbf83793d36e2753fcd
2025-01-31T00:00:24Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
1939-1404
2151-1535
2025-01-01
18
4074
4092
10.1109/JSTARS.2025.3525634
10839278
Enhancing Remote Sensing Semantic Segmentation Accuracy and Efficiency Through Transformer and Knowledge Distillation
Kang Zheng (0) https://orcid.org/0000-0002-4811-0001
Yu Chen (1) https://orcid.org/0000-0002-2825-6767
Jingrong Wang (2) https://orcid.org/0000-0002-4226-5803
Zhifei Liu (3) https://orcid.org/0009-0009-4707-9346
Shuai Bao (4) https://orcid.org/0000-0003-2275-0147
Jiao Zhan (5) https://orcid.org/0009-0003-8519-2207
Nan Shen (6) https://orcid.org/0000-0001-8468-5550
College of Geography and Environment Science, Henan University, Kaifeng, China
School of Geomatics and Technology, Nanjing Tech University, Nanjing, China
GNSS Research Center, Wuhan University, Wuhan, China
Department of Aerospace and Geodesy, Technical University of Munich, Munich, Germany
China and Research Center of Geospatial Big Data Application, Chinese Academy of Surveying and Mapping, Beijing, China
GNSS Research Center, Wuhan University, Wuhan, China
School of Geomatics and Technology, Nanjing Tech University, Nanjing, China
title Enhancing Remote Sensing Semantic Segmentation Accuracy and Efficiency Through Transformer and Knowledge Distillation
topic Convolutional neural network (CNN)
remote sensing
semantic segmentation
transformer
url https://ieeexplore.ieee.org/document/10839278/