GCT-GF: A generative CNN-transformer for multi-modal multi-temporal gap-filling of surface water probability

Bibliographic Details
Main Authors: Yanjiao Song, Linyi Li, Yun Chen, Junjie Li, Zhe Wang, Zhen Zhang, Xi Wang, Wen Zhang, Lingkui Meng
Format: Article
Language: English
Published: Elsevier 2025-07-01
Series: International Journal of Applied Earth Observation and Geoinformation
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225002432
Description
Summary: Spatial and temporal data gaps present a significant challenge to high-frequency surface water mapping using satellite imagery. Utilizing observations from temporally close periods and multi-modal sensors for gap-filling is of critical importance. However, the discontinuous pixel values inherent to conventional water maps hinder the application of deep learning methods, which are effective and popular in related studies. In this study, a novel approach, termed “gap-filling of surface water probability”, is introduced to achieve seamless surface water mapping. A new fused dataset tailored for this purpose was constructed, consisting of paired synthetic aperture radar (SAR) and surface water probability data with a 10-meter spatial resolution at a 10-day interval. A Generative CNN-Transformer (GCT) for Gap-Filling (GF) of surface water probability, GCT-GF, was then proposed to integrate the strengths of convolutional neural networks (CNNs) and transformers to reconstruct gapless water probability images from multi-modal and multi-temporal data. GCT-GF employs a coarse-to-fine structure: information from different time points is first aggregated by a branched gated inpainting module, and the coarse output is then refined and aligned under target SAR guidance. For adversarial learning, a branched SN-PatchGAN discriminator is introduced to accommodate the multi-temporal input. The results show that GCT-GF surpasses relevant state-of-the-art methods in both quantitative metrics and visual quality. The fusion of multi-modal, multi-temporal inputs markedly enhances gap-filling performance across varying gap ratios. Applied to Baiyangdian, the Poyang Lake Basin, and Qinghai Lake, GCT-GF demonstrates high reliability in large-scale scenes.
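The coarse-to-fine idea in the abstract can be illustrated with a toy NumPy sketch: a gate softly suppresses gap pixels while aggregating temporal neighbors (standing in for the paper's learned branched gated inpainting module), and the coarse result is then blended with a SAR-derived estimate inside the gaps (standing in for SAR-guided refinement). All function names, the fixed sigmoid gate, and the convex blend are illustrative assumptions, not the paper's actual learned operators.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fill(feature, mask):
    """Toy gated-inpainting step: a gate (here a fixed sigmoid of the validity
    mask, where the paper learns it with convolutions) decides how much of
    each pixel passes through, softly down-weighting gap pixels instead of
    hard-masking them."""
    gate = sigmoid(4.0 * (mask - 0.5))  # ~0 inside gaps, ~1 on valid pixels
    return feature * gate

def coarse_to_fine(prob_t0, prob_t1, mask, sar_estimate):
    """Toy coarse-to-fine pass: aggregate two temporal neighbors into a coarse
    water-probability map, then fill the gap region (mask == 0) from a
    SAR-derived probability estimate. `sar_estimate` is a hypothetical input;
    the actual model refines features under SAR guidance end to end."""
    coarse = gated_fill(0.5 * (prob_t0 + prob_t1), mask)      # temporal aggregation
    refined = mask * coarse + (1.0 - mask) * sar_estimate     # fill gaps from SAR
    return np.clip(refined, 0.0, 1.0)                         # keep valid probabilities
```

The sketch only conveys the data flow (temporal aggregation, gating, SAR-guided gap filling); the published method replaces each step with learned CNN/transformer blocks trained adversarially.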
ISSN:1569-8432