TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation

In recent years, image semantic segmentation algorithms have made significant progress driven by deep learning technology, and are widely used in fields such as medical image analysis, assistive technology for visually impaired people, and autonomous driving. Aiming at problems such as the inabi...

Bibliographic Details
Main Authors: Tengfei Chai, Zhiguo Xiao, Xiangfeng Shen, Qian Liu, NianFeng Li, Tong Guan, Jia Tian
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Image segmentation; semantic segmentation; transformer mechanism
Online Access: https://ieeexplore.ieee.org/document/10820358/
_version_ 1841542537760735232
author Tengfei Chai
Zhiguo Xiao
Xiangfeng Shen
Qian Liu
NianFeng Li
Tong Guan
Jia Tian
author_facet Tengfei Chai
Zhiguo Xiao
Xiangfeng Shen
Qian Liu
NianFeng Li
Tong Guan
Jia Tian
author_sort Tengfei Chai
collection DOAJ
description In recent years, image semantic segmentation algorithms have made significant progress driven by deep learning, and they are widely used in fields such as medical image analysis, assistive technology for visually impaired people, and autonomous driving. Many image segmentation algorithms, however, cannot fully capture global context information, are computationally inefficient, and fuse context information insufficiently. To address these problems, this article integrates the Transformer mechanism and the Coord Attention (CA) mechanism into the DeepLabV3+ network and proposes the TransDeep network. In the encoder, two backbones, Xception and MobileNetV2, are first compared for feature extraction, and the better one is selected. Then, building on the lightweight backbone network, the Transformer mechanism is integrated into the high-level features of the backbone to strengthen long-range dependencies. Next, a CA module is added after the low-level features of the backbone to strengthen edge and detail information. Finally, a CA module is added after the ASPP module so that the model focuses on key image features while filtering out irrelevant background information. Experimental results show that the TransDeep network improves accuracy on key categories and improves the network’s segmentation accuracy for targets in images. It achieves an mIoU of 73.5% on the Pascal test set and 79.95% on the CamVid test set, improvements of 2.10% and 2.61%, respectively, over the baseline model.
format Article
id doaj-art-3d82fda4d35648eca0319379d2ef4774
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-3d82fda4d35648eca0319379d2ef4774
2025-01-14T00:02:19Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
6277-6291
10.1109/ACCESS.2024.3525065
10820358
TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
Tengfei Chai (https://orcid.org/0009-0007-6597-469X), School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
Zhiguo Xiao (https://orcid.org/0000-0001-6719-0652), School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
Xiangfeng Shen (https://orcid.org/0000-0002-2461-5419), School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, Jilin, China
Qian Liu, School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
NianFeng Li (https://orcid.org/0000-0003-2450-5217), School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
Tong Guan, School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
Jia Tian, School of Computer Science and Technology, Changchun University, Changchun, Jilin, China
In recent years, image semantic segmentation algorithms have made significant progress driven by deep learning, and they are widely used in fields such as medical image analysis, assistive technology for visually impaired people, and autonomous driving. Many image segmentation algorithms, however, cannot fully capture global context information, are computationally inefficient, and fuse context information insufficiently. To address these problems, this article integrates the Transformer mechanism and the Coord Attention (CA) mechanism into the DeepLabV3+ network and proposes the TransDeep network. In the encoder, two backbones, Xception and MobileNetV2, are first compared for feature extraction, and the better one is selected. Then, building on the lightweight backbone network, the Transformer mechanism is integrated into the high-level features of the backbone to strengthen long-range dependencies. Next, a CA module is added after the low-level features of the backbone to strengthen edge and detail information. Finally, a CA module is added after the ASPP module so that the model focuses on key image features while filtering out irrelevant background information. Experimental results show that the TransDeep network improves accuracy on key categories and improves the network’s segmentation accuracy for targets in images. It achieves an mIoU of 73.5% on the Pascal test set and 79.95% on the CamVid test set, improvements of 2.10% and 2.61%, respectively, over the baseline model.
https://ieeexplore.ieee.org/document/10820358/
Image segmentation
semantic segmentation
transformer mechanism
spellingShingle Tengfei Chai
Zhiguo Xiao
Xiangfeng Shen
Qian Liu
NianFeng Li
Tong Guan
Jia Tian
TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
IEEE Access
Image segmentation
semantic segmentation
transformer mechanism
title TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
title_full TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
title_fullStr TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
title_full_unstemmed TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
title_short TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
title_sort transdeep transformer integrated deeplabv3 for image semantic segmentation
topic Image segmentation
semantic segmentation
transformer mechanism
url https://ieeexplore.ieee.org/document/10820358/
work_keys_str_mv AT tengfeichai transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT zhiguoxiao transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT xiangfengshen transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT qianliu transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT nianfengli transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT tongguan transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
AT jiatian transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation
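
The description above reports results as mean Intersection over Union (mIoU). This record does not say how the article computes it, so the following is only a minimal sketch of the standard definition: per-class IoU averaged over classes. The function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, not details taken from the article.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in the prediction or the ground truth.

    pred and target are flattened label maps (sequences of class indices).
    Skipping classes absent from both is an assumed convention.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as c
    target_count = [0] * num_classes   # pixels labeled c in the ground truth

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # class appears in at least one of the two maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes on a six-pixel "image".
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

In practice the same counts are usually accumulated in a confusion matrix over the whole test set before the per-class ratios are taken, rather than averaged image by image.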