Ground-Based Remote Sensing Cloud Image Segmentation Using Convolution-MLP Network
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11098898/ |
| Summary: | Recently, multilayer perceptron (MLP)-based methods in computer vision have attracted much attention due to their ability to learn long-range dependencies. However, MLP-based methods usually treat all tokens equally, which makes it difficult to segment challenging cloud regions. In this article, we propose a novel network named convolution-MLP network (Con-MLPNet) for ground-based remote sensing cloud image segmentation, which effectively learns long-range dependencies by combining MLPs with the attention mechanism. To this end, we propose the attention-guided MLPs module to highlight salient features and suppress irrelevant features from the spatial and channel aspects. Meanwhile, unlike existing MLP methods in which long-range dependencies are learned at a single scale, we propose the dilated MLPs (DMLPs) to learn long-range dependencies at different scales by sampling different channels of tokens. Furthermore, we design the parallel dilated MLPs module to integrate multiple DMLPs with different parameters in order to extract multiscale information. We conduct a series of experiments on three public ground-based cloud image segmentation datasets, i.e., TLCDD, SWIMSEG, and TCDD, and the results demonstrate that the proposed Con-MLPNet achieves state-of-the-art performance. Specifically, on the TLCDD dataset, our method surpasses competing methods across all five evaluation metrics, improving on the second-best results by 3.3% in precision, 2.48% in recall, 3.74% in F-score, 1.76% in accuracy, and 4.0% in IoU. |
| ISSN: | 1939-1404, 2151-1535 |
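The record gives only a high-level description of the dilated MLPs, i.e., learning long-range dependencies at different scales "by sampling different channels of tokens." One plausible reading is a token-mixing MLP applied to channel groups sampled at a stride equal to the dilation rate. The sketch below is a hypothetical NumPy illustration of that reading under these assumptions, not the authors' implementation; the function name `dilated_mlp` and its signature are invented for this example.

```python
import numpy as np

def dilated_mlp(tokens, weight, dilation):
    """Hypothetical sketch of a dilated token-mixing MLP (DMLP).

    tokens:   (num_tokens, channels) array of flattened image tokens.
    weight:   (num_tokens, num_tokens) token-mixing matrix.
    dilation: stride used to sample channels into groups.

    Channels are sampled with stride `dilation` into groups; the same
    token-mixing weight is applied across tokens within each group, and
    the groups are re-interleaved so the original channel order is kept.
    A larger dilation yields coarser channel sampling, which is one way
    to read "long-range dependencies at different scales."
    """
    n, c = tokens.shape
    assert c % dilation == 0, "channels must be divisible by the dilation rate"
    out = np.empty_like(tokens)
    for offset in range(dilation):
        group = tokens[:, offset::dilation]        # strided channel sampling
        out[:, offset::dilation] = weight @ group  # mix across tokens
    return out
```

A parallel dilated MLPs module, as described in the summary, would then run several such branches with different `dilation` values and fuse their outputs (e.g., by summation or concatenation) to capture multiscale information.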