Semantic segmentation of underwater images based on the improved SegFormer
Underwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. Nevertheless, given the complexity and variability of the underwater environment, improving model accuracy remains a key challenge in underwater image segmentation tasks.
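The abstract describes an FPN-style decoder that fuses feature maps at multiple resolutions. The following is a minimal, hypothetical sketch of that top-down fusion idea in pure Python (toy single-channel 2D lists rather than real tensors; it is not the authors' implementation, only an illustration of how coarse features are upsampled and merged into finer ones):

```python
# Toy illustration of FPN-style top-down feature fusion: start from the
# coarsest feature map, repeatedly upsample it 2x and add the next finer
# map, as in a Feature Pyramid Network decoder.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def fpn_fuse(pyramid):
    """Fuse a pyramid ordered finest-to-coarsest by top-down addition."""
    fused = pyramid[-1]                          # coarsest level
    for finer in reversed(pyramid[:-1]):
        up = upsample2x(fused)                   # match the finer resolution
        fused = [[a + b for a, b in zip(ur, fr)] # elementwise merge
                 for ur, fr in zip(up, finer)]
    return fused

# Toy pyramid: 4x4, 2x2, and 1x1 maps (finest to coarsest).
p2 = [[1] * 4 for _ in range(4)]
p3 = [[2] * 2 for _ in range(2)]
p4 = [[3]]
fused = fpn_fuse([p2, p3, p4])
print(len(fused), len(fused[0]), fused[0][0])  # 4 4 6
```

In a real network the merge would use learned lateral convolutions and tensor upsampling; this sketch only shows the resolution-matching and fusion order the abstract refers to.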
Saved in:
| Main Authors: | Bowei Chen, Wei Zhao, Qiusheng Zhang, Mingliang Li, Mingyang Qi, You Tang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-03-01 |
| Series: | Frontiers in Marine Science |
| Subjects: | underwater images; semantic segmentation; attention mechanism; feature fusion; SegFormer |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/full |
| _version_ | 1850069462460399616 |
|---|---|
| author | Bowei Chen; Wei Zhao; Qiusheng Zhang; Mingliang Li; Mingyang Qi; You Tang |
| author_facet | Bowei Chen; Wei Zhao; Qiusheng Zhang; Mingliang Li; Mingyang Qi; You Tang |
| author_sort | Bowei Chen |
| collection | DOAJ |
| description | Underwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. Nevertheless, given the complexity and variability of the underwater environment, improving model accuracy remains a key challenge in underwater image segmentation tasks. To address these issues, this study presents a high-performance semantic segmentation approach for underwater images based on the standard SegFormer model. First, the Mix Transformer backbone in SegFormer is replaced with a Swin Transformer to enhance feature extraction and facilitate efficient acquisition of global context information. Next, the Efficient Multi-scale Attention (EMA) mechanism is introduced in the backbone’s downsampling stages and the decoder to better capture multi-scale features, further improving segmentation accuracy. Furthermore, a Feature Pyramid Network (FPN) structure is incorporated into the decoder to combine feature maps at multiple resolutions, allowing the model to integrate contextual information effectively and enhancing robustness in complex underwater environments. Testing on the SUIM underwater image dataset shows that the proposed model achieves high performance across multiple metrics: mean Intersection over Union (MIoU) of 77.00%, mean Recall (mRecall) of 85.04%, mean Precision (mPrecision) of 89.03%, and mean F1-score (mF1score) of 86.63%. Compared to the standard SegFormer, it demonstrates improvements of 3.73% in MIoU, 1.98% in mRecall, 3.38% in mPrecision, and 2.44% in mF1score, with an increase of 9.89M parameters. The results demonstrate that the proposed method achieves superior segmentation accuracy with minimal additional computation, showcasing high performance in underwater image segmentation. |
| format | Article |
| id | doaj-art-23362b46917d40cd8fc6a2076d74156e |
| institution | DOAJ |
| issn | 2296-7745 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Marine Science |
| spelling | doaj-art-23362b46917d40cd8fc6a2076d74156e; 2025-08-20T02:47:46Z; eng; Frontiers Media S.A.; Frontiers in Marine Science; 2296-7745; 2025-03-01; vol. 12; 10.3389/fmars.2025.1522160; 1522160; Semantic segmentation of underwater images based on the improved SegFormer. Authors and affiliations: Bowei Chen (Qingdao Innovation and Development Base, Harbin, China; Laboratory of Underwater Intelligence, Harbin Engineering University, Qingdao, China); Wei Zhao (Qingdao Innovation and Development Base, Harbin, China; Laboratory of Underwater Intelligence, Harbin Engineering University, Qingdao, China); Qiusheng Zhang, Mingliang Li, Mingyang Qi, You Tang (Electrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, China); You Tang (College of Information Technology, Jilin Agricultural University, Changchun, China); You Tang (College of Agriculture, Yanbian University, Yanji, China). Abstract repeated in the description field above. https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/full Keywords: underwater images; semantic segmentation; attention mechanism; feature fusion; SegFormer |
| spellingShingle | Bowei Chen; Wei Zhao; Qiusheng Zhang; Mingliang Li; Mingyang Qi; You Tang; Semantic segmentation of underwater images based on the improved SegFormer; Frontiers in Marine Science; underwater images; semantic segmentation; attention mechanism; feature fusion; SegFormer |
| title | Semantic segmentation of underwater images based on the improved SegFormer |
| title_full | Semantic segmentation of underwater images based on the improved SegFormer |
| title_fullStr | Semantic segmentation of underwater images based on the improved SegFormer |
| title_full_unstemmed | Semantic segmentation of underwater images based on the improved SegFormer |
| title_short | Semantic segmentation of underwater images based on the improved SegFormer |
| title_sort | semantic segmentation of underwater images based on the improved segformer |
| topic | underwater images; semantic segmentation; attention mechanism; feature fusion; SegFormer |
| url | https://www.frontiersin.org/articles/10.3389/fmars.2025.1522160/full |
| work_keys_str_mv | AT boweichen semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer AT weizhao semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer AT qiushengzhang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer AT mingliangli semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer AT mingyangqi semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer AT youtang semanticsegmentationofunderwaterimagesbasedontheimprovedsegformer |