Multisensor Diffusion-Driven Optical Image Translation for Large-Scale Applications

Bibliographic Details
Main Authors: Joao Gabriel Vinholi, Marco Chini, Anis Amziane, Renato Machado, Danilo Silva, Patrick Matgen
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10768188/
Description
Summary: Comparing images captured by disparate sensors is a common challenge in remote sensing. It requires image translation: converting imagery from one sensor domain to another while preserving the original content. Denoising diffusion implicit models (DDIM) are potential state-of-the-art solutions for such domain translation because of their proven superiority in multiple image-to-image translation tasks in computer vision. However, these models struggle to reproduce the radiometric features of large-scale multipatch imagery, resulting in inconsistencies across the full image. This renders downstream tasks such as heterogeneous change detection impractical. To overcome these limitations, we propose a method that leverages denoising diffusion for effective multisensor optical image translation over large areas. Our approach super-resolves large-scale, low-spatial-resolution images into high-resolution equivalents from disparate optical sensors, ensuring uniformity across hundreds of patches. Our contributions lie in new forward and reverse diffusion processes that address the challenges of large-scale image translation. Extensive experiments using paired Sentinel-2 (10 m) and Planet Dove (3 m) images demonstrate that our approach provides precise domain adaptation, preserving image content while improving radiometric accuracy and feature representation. A thorough image quality assessment and comparisons with the standard DDIM framework and five other leading methods are presented. We reach a mean learned perceptual image patch similarity (LPIPS) of 0.1884 and a Fréchet inception distance (FID) of 45.64, substantially outperforming all compared methods, including DDIM, ShuffleMixer, and SwinIR. The usefulness of our approach is further demonstrated in two heterogeneous change detection tasks.
ISSN: 1939-1404
2151-1535
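
Note: The summary names the standard DDIM framework as the baseline the article compares against. For readers unfamiliar with it, the following is a minimal sketch of one deterministic reverse step of the generic DDIM sampler (the eta = 0 case), operating on a flat list of pixel values. It is not the modified forward and reverse processes proposed in the article, which this record does not describe. The function names `forward_noise` and `ddim_step` and all numeric values are illustrative assumptions.

```python
import math


def forward_noise(x0, eps, alpha_bar_t):
    """Standard diffusion forward marginal: x_t = sqrt(ab_t)*x0 + sqrt(1-ab_t)*eps."""
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM reverse step (eta = 0).

    eps_pred stands in for the output of a trained noise-prediction network
    (hypothetical here; no network is defined in this sketch).
    """
    sqrt_ab_t = math.sqrt(alpha_bar_t)
    sqrt_1m_ab_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_ab_prev = math.sqrt(alpha_bar_prev)
    sqrt_1m_ab_prev = math.sqrt(1.0 - alpha_bar_prev)
    x_prev = []
    for x, e in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the predicted noise.
        x0_hat = (x - sqrt_1m_ab_t * e) / sqrt_ab_t
        # Re-noise the estimate to the previous (less noisy) level.
        x_prev.append(sqrt_ab_prev * x0_hat + sqrt_1m_ab_prev * e)
    return x_prev


# Illustrative values only: three "pixels" and a fixed noise vector.
x0 = [0.2, -0.5, 0.9]
eps = [0.1, -1.2, 0.4]
x_t = forward_noise(x0, eps, alpha_bar_t=0.5)
x_prev = ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_prev=0.8)
```

When the noise prediction is exact, as in this toy usage, the step lands on the forward marginal at the less noisy level. In practice the prediction comes from a network applied patch by patch, which is where the cross-patch inconsistencies mentioned in the summary can arise.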