Cross-Temporal Remote Sensing Image Change Captioning: A Manifold Mapping and Bayesian Diffusion Approach for Land Use Monitoring

Bibliographic Details
Main Authors: Qingshan Bai, Xiaohua Wang
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access:https://ieeexplore.ieee.org/document/11021286/
Description
Summary:This study proposes CTM, a cross-temporal remote sensing image change captioning (RSICC) model built on manifold mapping and Bayesian diffusion. CTM aims to improve the accuracy and robustness of change captioning for multitemporal remote sensing images (RSIs). The model first employs manifold mapping to model illumination variations, reducing the impact of seasonal and lighting factors on image consistency. Bayesian diffusion is then introduced to improve the modeling of cross-temporal image changes, enhancing robustness against noise and pseudo-changes. In addition, a dual-layer multicoding module strengthens temporal feature representation, improving the perception of change regions. Finally, an image-text captioning strategy based on difference enhancement and dual attention is proposed to optimize feature selection and improve the accuracy and detail of the textual descriptions. Experimental results show that CTM is more robust on long-span RSIs, effectively mitigating pseudo-changes caused by illumination and seasonal variations. On the LEVIR-CC dataset, CTM achieves a CIDEr score of 138.78, outperforming the best existing method by 7.38 points. On the WHU-CDC dataset, CTM achieves the highest BLEU and METEOR scores, with a CIDEr score of 153.29. Furthermore, visual analysis indicates that CTM accurately captures real change regions while significantly suppressing pseudo-changes, maintaining high descriptive accuracy even in complex environments. The study thus offers an efficient and precise solution for applications such as land use monitoring, environmental monitoring, and disaster response.
ISSN:1939-1404, 2151-1535