GCT-GF: A generative CNN-transformer for multi-modal multi-temporal gap-filling of surface water probability
| Main Authors: | , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-07-01 |
| Series: | International Journal of Applied Earth Observations and Geoinformation |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843225002432 |
| Summary: | Spatial and temporal data gaps present a significant challenge to high-frequency surface water mapping using satellite imagery. Utilizing observations from temporally close periods and multi-modal sensors for gap-filling is of critical importance. However, the discontinuous pixel values inherent to conventional water maps hinder the application of deep learning methods, which are effective and popular in related studies. In this study, a novel approach, termed “gap-filling of surface water probability”, is introduced to achieve seamless surface water mapping. A new fused dataset tailored for this purpose was constructed, consisting of paired synthetic aperture radar (SAR) and surface water probability data with a 10-meter spatial resolution at a 10-day interval. A Generative CNN-Transformer (GCT) for Gap-Filling (GF) of surface water probability, GCT-GF, was then proposed to integrate the strengths of convolutional neural networks (CNNs) and transformers to reconstruct gapless water probability images from multi-modal and multi-temporal data. The GCT-GF employs a coarse-to-fine structure: information from different time points is initially aggregated using a branched gated inpainting module, followed by refinement and alignment of the coarse output under target SAR guidance. For adversarial learning, a branched SN-PatchGAN discriminator is introduced to adapt to the multi-temporal input. The results show that GCT-GF surpasses relevant state-of-the-art methods in both quantitative metrics and visual quality. The fusion of multi-modal, multi-temporal inputs noticeably enhances gap-filling performance across varying gap ratios. Applied to Baiyangdian, the Poyang Lake Basin, and Qinghai Lake, GCT-GF demonstrates high reliability in large-scale scenes. |
|---|---|
| ISSN: | 1569-8432 |
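The summary mentions a "branched gated inpainting module". A common building block in such inpainting networks is the gated convolution: a feature branch modulated elementwise by a sigmoid gate, so windows dominated by gap pixels contribute weakly downstream. The following is a minimal pure-Python sketch of that gating idea under stated assumptions; the kernels, the toy patch, and the function names are invented for illustration and are not the authors' implementation.

```python
import math

def conv2d(img, kernel):
    """Valid-mode 2D cross-correlation over a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(img[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_conv(img, feat_kernel, gate_kernel):
    """Gated convolution: feature response scaled by a soft validity gate.

    output = conv(img, W_feat) * sigmoid(conv(img, W_gate)),
    so windows with little valid (nonzero) data receive a small gate
    and are downweighted relative to fully observed windows.
    """
    feat = conv2d(img, feat_kernel)
    gate = conv2d(img, gate_kernel)
    return [[f * sigmoid(g) for f, g in zip(frow, grow)]
            for frow, grow in zip(feat, gate)]

# Toy 5x5 water-probability patch; zeros mark a 2x2 data gap.
patch = [
    [0.9, 0.8, 0.9, 0.7, 0.8],
    [0.8, 0.0, 0.0, 0.6, 0.9],
    [0.9, 0.0, 0.0, 0.8, 0.7],
    [0.7, 0.9, 0.8, 0.9, 0.8],
    [0.8, 0.7, 0.9, 0.8, 0.9],
]
mean3 = [[1 / 9] * 3 for _ in range(3)]  # averaging "feature" kernel
gate3 = [[1.0] * 3 for _ in range(3)]    # gate kernel: sums valid mass

filled = gated_conv(patch, mean3, gate3)  # 3x3 gated response
```

In a trained network both kernels are learned (and there is one gate per output channel); here fixed kernels simply show how the gate tracks the amount of valid data seen by each window.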