TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation
In recent years, image semantic segmentation algorithms have made significant progress driven by deep learning, and are widely used in fields such as medical image analysis, assistive technology for visually impaired people, and autonomous driving. Aiming at problems such as the inabi...
Main Authors: | Tengfei Chai, Zhiguo Xiao, Xiangfeng Shen, Qian Liu, NianFeng Li, Tong Guan, Jia Tian |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | Image segmentation; semantic segmentation; transformer mechanism |
Online Access: | https://ieeexplore.ieee.org/document/10820358/ |
_version_ | 1841542537760735232 |
---|---|
author | Tengfei Chai, Zhiguo Xiao, Xiangfeng Shen, Qian Liu, NianFeng Li, Tong Guan, Jia Tian |
author_facet | Tengfei Chai, Zhiguo Xiao, Xiangfeng Shen, Qian Liu, NianFeng Li, Tong Guan, Jia Tian |
author_sort | Tengfei Chai |
collection | DOAJ |
description | In recent years, image semantic segmentation algorithms have made significant progress driven by deep learning, and are widely used in fields such as medical image analysis, assistive technology for visually impaired people, and autonomous driving. To address the inability of many image segmentation algorithms to fully capture global context information, their low computational efficiency, and their insufficient fusion of context information, this article integrates the Transformer mechanism and the Coord Attention (CA) mechanism into the DeepLabV3+ network and proposes the TransDeep network. In the encoder, two backbones, Xception and MobileNetV2, are first compared for feature extraction, and the better one is selected. Next, building on the lightweight backbone, the Transformer mechanism is applied to the backbone's high-level features to strengthen long-range dependencies. A CA module is then added after the backbone's low-level features to strengthen edge and detail information. Finally, a CA module is added after the ASPP module so that the model focuses on key image features while filtering out irrelevant background information. Experimental results show that TransDeep improves accuracy on key categories and the overall segmentation accuracy of targets in images. It achieves an MIoU of 73.5% on the Pascal test set and 79.95% on the CamVid test set, improvements of 2.10% and 2.61%, respectively, over the baseline model. |
format | Article |
id | doaj-art-3d82fda4d35648eca0319379d2ef4774 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-3d82fda4d35648eca0319379d2ef4774; 2025-01-14T00:02:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 6277-6291; DOI 10.1109/ACCESS.2024.3525065; article no. 10820358; TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation; Authors: Tengfei Chai (0, https://orcid.org/0009-0007-6597-469X), Zhiguo Xiao (1, https://orcid.org/0000-0001-6719-0652), Xiangfeng Shen (2, https://orcid.org/0000-0002-2461-5419), Qian Liu (3), NianFeng Li (4, https://orcid.org/0000-0003-2450-5217), Tong Guan (5), Jia Tian (6); Affiliations, in record order: School of Computer Science and Technology, Changchun University, Changchun, Jilin, China (listed twice); School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, Jilin, China; School of Computer Science and Technology, Changchun University, Changchun, Jilin, China (listed four times); Abstract: as in the description field; https://ieeexplore.ieee.org/document/10820358/; Keywords: Image segmentation; semantic segmentation; transformer mechanism |
spellingShingle | Tengfei Chai, Zhiguo Xiao, Xiangfeng Shen, Qian Liu, NianFeng Li, Tong Guan, Jia Tian; TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation; IEEE Access; Image segmentation; semantic segmentation; transformer mechanism |
title | TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation |
title_full | TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation |
title_fullStr | TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation |
title_full_unstemmed | TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation |
title_short | TransDeep: Transformer-Integrated DeepLabV3+ for Image Semantic Segmentation |
title_sort | transdeep transformer integrated deeplabv3 for image semantic segmentation |
topic | Image segmentation; semantic segmentation; transformer mechanism |
url | https://ieeexplore.ieee.org/document/10820358/ |
work_keys_str_mv | AT tengfeichai transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT zhiguoxiao transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT xiangfengshen transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT qianliu transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT nianfengli transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT tongguan transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation AT jiatian transdeeptransformerintegrateddeeplabv3forimagesemanticsegmentation |
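
The description above reports results as MIoU (mean intersection over union). As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. The function name `mean_iou`, the toy label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example and are not taken from the article. The last two lines back out baseline scores under the further assumption that the reported gains of 2.10% and 2.61% are absolute percentage points, which the record does not state.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection[c] / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Hypothetical toy data: two flattened 2x3 label maps with three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2

# Implied baselines IF the reported gains are absolute percentage points (assumption).
pascal_baseline = round(73.5 - 2.10, 2)
camvid_baseline = round(79.95 - 2.61, 2)
```

Benchmark evaluations usually accumulate the per-class intersections and unions over the whole test set before averaging, rather than scoring one image at a time; the sketch handles a single pair of label maps only.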