MAMNet: Lightweight Multi-Attention Collaborative Network for Fine-Grained Cropland Extraction from Gaofen-2 Remote Sensing Imagery
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Agriculture |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2077-0472/15/11/1152 |
| Summary: | To address the high computational complexity and loss of boundary features encountered when extracting farmland information from high-resolution remote sensing images, this study proposes MAMNet, a CNN–Transformer hybrid network. The framework integrates a lightweight encoder, a global–local Transformer decoder, and a bidirectional attention architecture to achieve efficient and accurate farmland extraction. First, we rebuild the ResNet-18 backbone with depthwise separable convolutions, reducing computational complexity while preserving feature representation capability. Second, the global–local Transformer block (GLTB) decoder uses multi-head self-attention to dynamically fuse multi-scale features across layers, restoring the topological structure of fragmented farmland boundaries. Third, we propose a novel bidirectional attention architecture: the Detail Improvement Module (DIM) uses channel attention to transfer semantic features to geometric features, while the Context Enhancement Module (CEM) uses spatial attention to achieve dynamic geometric–semantic fusion, quantitatively distinguishing farmland textures from mixed ground cover. A positional attention mechanism (PAM) enhances the continuity of linear features by strengthening spatial correlations in the skip connections. Cascading a front-end feature module (FEM) to expand the receptive field and combining it with an adaptive feature reconstruction head (FRH) improves information integrity in fragmented areas. Evaluation on a 2022 Gaofen-2 image dataset of Chenggong District, Kunming City, shows that MAMNet achieves an mIoU of 86.68% (improvements of 1.66% and 2.44% over UNetFormer and BANet, respectively) and an F1-score of 92.86% with only 12 million parameters. The method offers new technical insights for plot-level farmland monitoring in precision agriculture. |
| ISSN: | 2077-0472 |
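
The summary attributes the encoder's lower computational cost to replacing standard convolutions with separable ones. As a rough illustration of why that helps, the sketch below counts weights for a standard convolution versus a depthwise separable one (a per-channel k x k filter followed by a 1 x 1 pointwise convolution). The function names and the 64-to-128-channel layer are assumptions chosen for illustration; they are not taken from the MAMNet paper and do not reproduce its architecture.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a depthwise k x k conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int = 3) -> float:
    """Separable / standard weight ratio; algebraically equal to 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    std = standard_conv_params(64, 128)
    sep = depthwise_separable_params(64, 128)
    print(f"standard: {std}, separable: {sep}, ratio: {reduction_ratio(64, 128):.3f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, so most of the saving comes from the kernel size rather than the channel count.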