Rethinking Scanning Strategies With Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study

Bibliographic Details
Main Authors: Qinfeng Zhu, Yuan Fang, Yuanzhi Cai, Cheng Chen, Lei Fan
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Image; Mamba; remote sensing; scanning strategies; segmentation; semantic
Online Access: https://ieeexplore.ieee.org/document/10703181/
collection DOAJ
description Deep learning methods, especially convolutional neural networks (CNNs) and vision transformers (ViTs), are frequently employed to perform semantic segmentation of high-resolution remotely sensed images. However, CNNs are constrained by their restricted receptive fields, while ViTs face challenges due to their quadratic complexity. Recently, the Mamba model, featuring linear complexity and a global receptive field, has gained extensive attention for vision tasks. In such tasks, images need to be serialized to form sequences compatible with the Mamba model. Numerous research efforts have explored scanning strategies to serialize images, aiming to enhance the Mamba model's understanding of images. However, the effectiveness of these scanning strategies remains uncertain. In this research, we conduct a comprehensive experimental investigation on the impact of mainstream scanning directions and their combinations on semantic segmentation of remotely sensed images. Through extensive experiments on the LoveDA, ISPRS Potsdam, ISPRS Vaihingen, and UAVid datasets, we demonstrate that no single scanning strategy outperforms others, regardless of their complexity or the number of scanning directions involved. A simple, single scanning direction is deemed sufficient for semantic segmentation of high-resolution remotely sensed images. Relevant directions for future research are also recommended.
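The serialization step mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `scan_indices` and `serialize`, the direction labels, and the toy 2 x 3 patch grid are assumptions made for illustration only. It shows how four common scanning directions turn the same grid of image patches into different 1-D sequences of the kind a sequence model such as Mamba consumes.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def scan_indices(height: int, width: int, direction: str) -> List[Tuple[int, int]]:
    """Return (row, col) positions in the order a given scan visits them.

    The direction names are hypothetical labels chosen for this sketch.
    """
    if direction in ("row", "row_reverse"):
        # Left to right within each row, rows from top to bottom.
        order = [(r, c) for r in range(height) for c in range(width)]
    elif direction in ("col", "col_reverse"):
        # Top to bottom within each column, columns from left to right.
        order = [(r, c) for c in range(width) for r in range(height)]
    else:
        raise ValueError(f"unknown scan direction: {direction}")
    if direction.endswith("_reverse"):
        # A reversed scan visits the same positions in the opposite order.
        order.reverse()
    return order


def serialize(patches: Sequence[Sequence[T]], direction: str) -> List[T]:
    """Flatten a 2-D grid of patches into a 1-D sequence for one scan direction."""
    height, width = len(patches), len(patches[0])
    return [patches[r][c] for r, c in scan_indices(height, width, direction)]


if __name__ == "__main__":
    # Toy grid: integers stand in for patch embeddings.
    grid = [[0, 1, 2],
            [3, 4, 5]]
    for name in ("row", "row_reverse", "col", "col_reverse"):
        print(name, serialize(grid, name))
```

A multi-directional strategy would run the sequence model on several of these orderings and merge the outputs; the study summarized above reports that, on its datasets, a single direction performed comparably.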
format Article
id doaj-art-b3e5cb75ee254917a29ca548c7ce8694
institution DOAJ
issn 1939-1404
2151-1535
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling Rethinking Scanning Strategies With Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (ISSN 1939-1404, 2151-1535), vol. 17, pp. 18223-18234, 2024-01-01. IEEE. DOI: 10.1109/JSTARS.2024.3472296. IEEE Xplore document 10703181. https://ieeexplore.ieee.org/document/10703181/
Qinfeng Zhu (https://orcid.org/0009-0002-4847-3555), Department of Civil Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China
Yuan Fang (https://orcid.org/0000-0003-0531-6066), Department of Civil Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China
Yuanzhi Cai (https://orcid.org/0000-0002-7005-5870), CSIRO Mineral Resources, Kensington, WA, Australia
Cheng Chen (https://orcid.org/0000-0002-1989-8602), Department of Civil Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China
Lei Fan (https://orcid.org/0000-0002-5538-4684), Department of Civil Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China
Record doaj-art-b3e5cb75ee254917a29ca548c7ce8694, last updated 2025-08-20T03:01:27Z, language eng
title Rethinking Scanning Strategies With Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study
topic Image
Mamba
remote sensing
scanning strategies
segmentation
semantic
url https://ieeexplore.ieee.org/document/10703181/