FML-Swin: An Improved Swin Transformer Segmentor for Remote Sensing Images
Semantic segmentation of urban remote sensing images is a challenging task. Because of the complex backgrounds, occlusion and overlap, and small-scale targets in urban remote sensing images, segmentation results suffer from defects such as confusion between similar targets, target boundary ambigui...
| Main Authors: | Tianren Wu, Wenqin Deng, Rui Lin, Junzhe Jiang, Xueyun Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Access |
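| Subjects: | Swin transformer; remote sensing images; semantic segmentation; feature interactive fusion; multi-scale detail sensing |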
| Online Access: | https://ieeexplore.ieee.org/document/10966862/ |
| _version_ | 1849712572193832960 |
|---|---|
| author | Tianren Wu; Wenqin Deng; Rui Lin; Junzhe Jiang; Xueyun Chen |
| collection | DOAJ |
| description | Semantic segmentation of urban remote sensing images is a challenging task. Because of the complex backgrounds, occlusion and overlap, and small-scale targets in urban remote sensing images, segmentation results suffer from defects such as confusion between similar targets, ambiguous target boundaries, and omission of small-scale targets. To address these problems, a feature-interactive fusion and multi-scale detail sensing lightweight enhanced Swin Transformer (FML-Swin) is proposed. The model includes three key components: a feature interactive fusion transformer (FIFT) module, which enhances the model’s focus on current channel features; a multi-scale detail sensing (MSDS) module, designed to capture small-scale features and details in remote sensing images; and a lightweight enhanced squeeze excitation (LESE) module, which enriches the semantic feature information contained in the input image while maintaining a lightweight design. With limited training rounds, the model achieves an mIoU of 78.58 on the multi-class semantic segmentation task of the Potsdam dataset, exceeding SegNeXt by 0.49. On the multi-class semantic segmentation task of the Vaihingen dataset, the model achieves an mIoU of 74.75, which is 0.17 higher than SegNeXt. These results demonstrate the validity of the model. |
| format | Article |
| id | doaj-art-9e0e723dd5ea4d13882df2d44b540f59 |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-9e0e723dd5ea4d13882df2d44b540f59; 2025-08-20T03:14:13Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; vol. 13, pp. 66931-66943; DOI 10.1109/ACCESS.2025.3561325; 10966862; FML-Swin: An Improved Swin Transformer Segmentor for Remote Sensing Images; Tianren Wu (https://orcid.org/0009-0000-9718-8289), Wenqin Deng (https://orcid.org/0009-0006-6643-2182), Rui Lin (https://orcid.org/0009-0002-3452-9757), Junzhe Jiang (https://orcid.org/0009-0004-4686-7520), Xueyun Chen (https://orcid.org/0000-0002-7452-0223); School of Electrical Engineering, Guangxi University, Nanning, China (all authors); https://ieeexplore.ieee.org/document/10966862/; Swin transformer; remote sensing images; semantic segmentation; feature interactive fusion; multi-scale detail sensing |
| title | FML-Swin: An Improved Swin Transformer Segmentor for Remote Sensing Images |
| topic | Swin transformer; remote sensing images; semantic segmentation; feature interactive fusion; multi-scale detail sensing |
| url | https://ieeexplore.ieee.org/document/10966862/ |
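The record's description reports results as mIoU (mean intersection over union) on the Potsdam and Vaihingen datasets. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou`, the toy labels, and the choice to skip classes that appear in neither the ground truth nor the predictions are all assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    Rows index the ground-truth class, columns the predicted class.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are
    skipped rather than counted as zero.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy labels for six pixels and three classes (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(y_true, y_pred, num_classes=3)
score = mean_iou(matrix)
print(f"mIoU = {100 * score:.2f}")
```

Scaling the score by 100, as in the final line, would put it on the same scale as figures such as 78.58 if those are percentages, which the record does not state explicitly.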