A Novel Change Detection Method Based on Visual Language From High-Resolution Remote Sensing Images


Bibliographic Details
Main Authors: Junlong Qiu, Wei Liu, Hui Zhang, Erzhu Li, Lianpeng Zhang, Xing Li
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10818767/
Description
Summary: Recently, the release of "all-in-one" foundation models has sparked rapid developments in artificial intelligence. However, because these models are typically trained on natural images, their potential in remote sensing remains largely untapped. To address this gap, this article proposes a novel change detection method based on visual language from high-resolution remote sensing images, named VLCD. Specifically, on the text side, we use context optimization to align text–image semantics. On the image side, we construct a side fusion network, which integrates universal features from the foundation model with domain-specific features from remote sensing through a bridging module. In addition, we introduce a change feature computation module to integrate global features, difference features, and textual information. To validate the effectiveness of the proposed method, we conducted comparative experiments on three public datasets. The results show that the proposed VLCD achieved state-of-the-art F1-scores and IoUs on these three datasets: LEVIR-CD (90.99%, 83.46%), SYSU-CD (83.05%, 71.01%), and S2Looking (62.75%, 45.89%), outperforming the results obtained through full fine-tuning while using less than one-tenth of the number of parameters.
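The core idea summarized above — scoring per-pixel difference features between the two acquisition dates against text-derived class embeddings — can be sketched in miniature as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: every function and variable name is hypothetical, random arrays stand in for the foundation-model image features and learned prompt embeddings, and the actual VLCD change feature computation module additionally fuses global features through its bridging and side fusion networks.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Normalize vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def change_probability(feat_t1, feat_t2, text_emb):
    """Toy change scoring from difference features and text embeddings.

    feat_t1, feat_t2: (H, W, D) per-pixel features of the pre-/post-change images
    text_emb:         (2, D) embeddings for the classes [unchanged, changed]
    Returns an (H, W) map of softmax probability for the "changed" class,
    computed from cosine similarity between difference features and text.
    """
    diff = l2_normalize(feat_t2 - feat_t1)      # per-pixel difference features
    text = l2_normalize(text_emb)               # unit-norm class text embeddings
    logits = diff @ text.T                      # (H, W, 2) cosine similarities
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)   # softmax over the two classes
    return probs[..., 1]                        # probability of "changed"

# Synthetic stand-ins for foundation-model features and prompt embeddings.
rng = np.random.default_rng(0)
f1 = rng.normal(size=(4, 4, 8))
f2 = f1.copy()
f2[0, 0] += 5.0                                 # simulate a change at one pixel
text = rng.normal(size=(2, 8))

p = change_probability(f1, f2, text)
print(p.shape)                                  # per-pixel change-probability map
```

In the paper's setting, the text embeddings would come from context optimization (learnable prompt vectors fed through a text encoder) rather than being fixed vectors, which lets the text side adapt to remote-sensing semantics with very few trainable parameters.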
ISSN:1939-1404
2151-1535