A high-resolution remote sensing land use/land cover classification method based on multi-level features adaptation of segment anything model

Bibliographic Details
Main Authors: Hui Yang, Zhipeng Jiang, Yaobo Zhang, Yanlan Wu, Heng Luo, Peng Zhang, Biao Wang
Format: Article
Language:English
Published: Elsevier 2025-07-01
Series:International Journal of Applied Earth Observation and Geoinformation
Online Access:http://www.sciencedirect.com/science/article/pii/S1569843225003061
Description
Summary:Land use/land cover (LULC) classification based on deep learning techniques is a significant research area for analyzing high-resolution remote sensing (HRRS) images. However, because of limited available samples and limited model feature extraction capability, current deep learning methods suffer from weak generalization, which hinders widespread and effective application across diverse HRRS scenarios. To address this problem, we propose an innovative network model named the multi-level feature adaptation-segment anything model (MLFA-SAM). The model employs a three-level fine-tuning strategy to adapt the SAM foundation model for remote sensing LULC classification, significantly enhancing high-precision classification performance across diverse HRRS scenarios. Specifically, the domain distribution shift adaptation (DDSA) level adjusts the input image modality for SAM, performs initial feature extraction, and overcomes the domain distribution shift between remote sensing images and the natural images on which SAM was trained. Then, we design a depthwise low-rank adaptation (DLRA) strategy to optimally fine-tune the frozen SAM parameters. Finally, we improve SAM's mask decoder to generate the high-quality multi-class masks required for LULC classification. Experimental results demonstrate that MLFA-SAM surpasses several existing state-of-the-art (SOTA) methods on the HRLC dataset and the ISPRS Potsdam dataset. Quantitative evaluations show that MLFA-SAM, with its concise yet efficient architecture, achieves 66.77% mIoU and 86.02% OA on the HRLC dataset; integrating the near-infrared (NIR) band further improves performance to 68.43% mIoU and 87.91% OA.
A generalization test on the LoveDA dataset, together with four HRRS test images exhibiting spatiotemporal and semantic scene differences, further demonstrates that MLFA-SAM possesses stronger generalization ability than existing methods and shows greater potential for practical applications.
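The abstract does not give the internals of the DLRA strategy, but the general idea of low-rank adaptation of a frozen layer can be sketched as follows. This is a minimal, hypothetical illustration (names and shapes are assumptions, not the paper's implementation): a frozen pretrained weight W is augmented with trainable low-rank factors A and B, so only r·(d_in + d_out) parameters are tuned instead of d_out·d_in.

```python
import numpy as np

# Hypothetical LoRA-style sketch, NOT the paper's DLRA implementation.
rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4  # layer dimensions and low rank (assumed values)

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight (e.g. a SAM layer)
A = np.zeros((d_out, r))                   # trainable factor, zero-initialized
B = rng.standard_normal((r, d_in)) * 0.01  # trainable factor

def adapted_forward(x):
    # y = x (W + A B)^T : the base path stays frozen, only A and B are tuned.
    return x @ W.T + (x @ B.T) @ A.T

x = rng.standard_normal((2, d_in))
y = adapted_forward(x)
print(y.shape)  # (2, 64)
```

Because A starts at zero, the adapted layer initially reproduces the frozen model's output exactly, and fine-tuning only perturbs it through the low-rank path.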
ISSN:1569-8432