Semantic segmentation of underwater images based on the improved SegFormer

Underwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. Nevertheless, given the complexity and variability of the underwater environment, improving model accuracy remains a key challenge in underwater image segm...

Bibliographic Details
Main Authors: Bowei Chen, Wei Zhao, Qiusheng Zhang, Mingliang Li, Mingyang Qi, You Tang
Format: Article
Language:English
Published: Frontiers Media S.A. 2025-03-01
Series:Frontiers in Marine Science
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/full
_version_ 1850069462460399616
author Bowei Chen
Wei Zhao
Qiusheng Zhang
Mingliang Li
Mingyang Qi
You Tang
author_facet Bowei Chen
Wei Zhao
Qiusheng Zhang
Mingliang Li
Mingyang Qi
You Tang
author_sort Bowei Chen
collection DOAJ
description Underwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. Nevertheless, given the complexity and variability of the underwater environment, improving model accuracy remains a key challenge in underwater image segmentation tasks. To address these issues, this study presents a high-performance semantic segmentation approach for underwater images based on the standard SegFormer model. First, the Mix Transformer backbone in SegFormer is replaced with a Swin Transformer to enhance feature extraction and facilitate efficient acquisition of global context information. Next, the Efficient Multi-scale Attention (EMA) mechanism is introduced in the backbone’s downsampling stages and the decoder to better capture multi-scale features, further improving segmentation accuracy. Furthermore, a Feature Pyramid Network (FPN) structure is incorporated into the decoder to combine feature maps at multiple resolutions, allowing the model to integrate contextual information effectively and enhancing robustness in complex underwater environments. Testing on the SUIM underwater image dataset shows that the proposed model achieves high performance across multiple metrics: mean Intersection over Union (MIoU) of 77.00%, mean Recall (mRecall) of 85.04%, mean Precision (mPrecision) of 89.03%, and mean F1-score (mF1score) of 86.63%. Compared to the standard SegFormer, it demonstrates improvements of 3.73% in MIoU, 1.98% in mRecall, 3.38% in mPrecision, and 2.44% in mF1score, with an increase of 9.89M parameters. The results demonstrate that the proposed method achieves superior segmentation accuracy with minimal additional computation, showcasing high performance in underwater image segmentation.
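The scores reported in the abstract (MIoU, mRecall, mPrecision, mF1-score) are class-averaged pixel metrics. A minimal sketch of how such metrics are conventionally computed from per-class pixel counts; this is illustrative only, not the paper's evaluation code, and the counts below are made up:

```python
def segmentation_metrics(per_class_counts):
    """Compute class-averaged segmentation metrics.

    per_class_counts: list of (tp, fp, fn) tuples, one per class, where
    tp/fp/fn are true-positive, false-positive, and false-negative pixel
    counts for that class over the whole test set.
    Returns (mIoU, mRecall, mPrecision, mF1) as means over classes.
    """
    ious, recalls, precisions, f1s = [], [], [], []
    for tp, fp, fn in per_class_counts:
        iou = tp / (tp + fp + fn)            # intersection over union
        recall = tp / (tp + fn)              # coverage of ground-truth pixels
        precision = tp / (tp + fp)           # purity of predicted pixels
        f1 = 2 * precision * recall / (precision + recall)
        ious.append(iou)
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)
    n = len(per_class_counts)
    return (sum(ious) / n, sum(recalls) / n, sum(precisions) / n, sum(f1s) / n)

# Two hypothetical classes (counts invented for illustration):
counts = [(80, 10, 10), (60, 20, 20)]
miou, mrecall, mprecision, mf1 = segmentation_metrics(counts)
```

Note that per-class F1 (the Dice score) is always at least as large as IoU, which is consistent with the paper reporting mF1-score (86.63%) above MIoU (77.00%).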
format Article
id doaj-art-23362b46917d40cd8fc6a2076d74156e
institution DOAJ
issn 2296-7745
language English
publishDate 2025-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Marine Science
spelling doaj-art-23362b46917d40cd8fc6a2076d74156e2025-08-20T02:47:46ZengFrontiers Media S.A.Frontiers in Marine Science2296-77452025-03-011210.3389/fmars.2025.15221601522160Semantic segmentation of underwater images based on the improved SegFormerBowei Chen0Bowei Chen1Wei Zhao2Wei Zhao3Qiusheng Zhang4Mingliang Li5Mingyang Qi6You Tang7You Tang8You Tang9Qingdao Innovation and Development Base, Harbin, ChinaLaboratory of Underwater Intelligence, Harbin Engineering University, Qingdao, ChinaQingdao Innovation and Development Base, Harbin, ChinaLaboratory of Underwater Intelligence, Harbin Engineering University, Qingdao, ChinaElectrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, ChinaElectrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, ChinaElectrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, ChinaElectrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, ChinaCollege of Information Technology, Jilin Agricultural University, Changchun, ChinaCollege of Agriculture, Yanbian University, Yanji, ChinaUnderwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. Nevertheless, given the complexity and variability of the underwater environment, improving model accuracy remains a key challenge in underwater image segmentation tasks. To address these issues, this study presents a high-performance semantic segmentation approach for underwater images based on the standard SegFormer model. First, the Mix Transformer backbone in SegFormer is replaced with a Swin Transformer to enhance feature extraction and facilitate efficient acquisition of global context information.
Next, the Efficient Multi-scale Attention (EMA) mechanism is introduced in the backbone’s downsampling stages and the decoder to better capture multi-scale features, further improving segmentation accuracy. Furthermore, a Feature Pyramid Network (FPN) structure is incorporated into the decoder to combine feature maps at multiple resolutions, allowing the model to integrate contextual information effectively, enhancing robustness in complex underwater environments. Testing on the SUIM underwater image dataset shows that the proposed model achieves high performance across multiple metrics: mean Intersection over Union (MIoU) of 77.00%, mean Recall (mRecall) of 85.04%, mean Precision (mPrecision) of 89.03%, and mean F1score (mF1score) of 86.63%. Compared to the standard SegFormer, it demonstrates improvements of 3.73% in MIoU, 1.98% in mRecall, 3.38% in mPrecision, and 2.44% in mF1score, with an increase of 9.89M parameters. The results demonstrate that the proposed method achieves superior segmentation accuracy with minimal additional computation, showcasing high performance in underwater image segmentation.https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/fullunderwater imagessemantic segmentationattention mechanismfeature fusionSegFormer
spellingShingle Bowei Chen
Wei Zhao
Qiusheng Zhang
Mingliang Li
Mingyang Qi
You Tang
Semantic segmentation of underwater images based on the improved SegFormer
Frontiers in Marine Science
underwater images
semantic segmentation
attention mechanism
feature fusion
SegFormer
title Semantic segmentation of underwater images based on the improved SegFormer
title_full Semantic segmentation of underwater images based on the improved SegFormer
title_fullStr Semantic segmentation of underwater images based on the improved SegFormer
title_full_unstemmed Semantic segmentation of underwater images based on the improved SegFormer
title_short Semantic segmentation of underwater images based on the improved SegFormer
title_sort semantic segmentation of underwater images based on the improved segformer
topic underwater images
semantic segmentation
attention mechanism
feature fusion
SegFormer
url https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/full
work_keys_str_mv AT boweichen semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT boweichen semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT weizhao semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT weizhao semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT qiushengzhang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT mingliangli semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT mingyangqi semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT youtang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT youtang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer
AT youtang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer