Global Optical and SAR Image Registration Method Based on Local Distortion Division

Variations in terrain elevation cause images acquired under different imaging modalities to deviate from a linear mapping relationship. This effect is particularly pronounced between optical and SAR images, where the range-based imaging mechanism of SAR sensors leads to significant local geometric d...

Full description

Saved in:
Bibliographic Details
Main Authors: Bangjie Li, Dongdong Guan, Yuzhen Xie, Xiaolong Zheng, Zhengsheng Chen, Lefei Pan, Weiheng Zhao, Deliang Xiang
Format: Article
Language:English
Published: MDPI AG 2025-05-01
Series:Remote Sensing
Subjects:
Online Access:https://www.mdpi.com/2072-4292/17/9/1642
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Variations in terrain elevation cause images acquired under different imaging modalities to deviate from a linear mapping relationship. This effect is particularly pronounced between optical and SAR images, where the range-based imaging mechanism of SAR sensors leads to significant local geometric distortions, such as perspective shrinkage and occlusion. As a result, it becomes difficult to represent the spatial correspondence between optical and SAR images using a single geometric model. To address this challenge, we propose a global optical-SAR image registration method that leverages local distortion characteristics. Specifically, we introduce a Superpixel-based Local Distortion Division (SLDD) method, which defines superpixel region features and segments the image into local distortion and normal regions by computing the Mahalanobis distance between superpixel features. We further design a Multi-Feature Fusion Capsule Network (MFFCN) that integrates shallow salient features with deep structural details, reconstructing the dimensions of digital capsules to generate feature descriptors encompassing texture, phase, structure, and amplitude information. This design effectively mitigates the information loss and feature degradation problems caused by pooling operations in conventional convolutional neural networks (CNNs). Additionally, a hard negative mining loss is incorporated to further enhance feature discriminability. Feature descriptors are extracted separately from regions with different distortion levels, and corresponding transformation models are built for local registration. Finally, the local registration results are fused to generate a globally aligned image. Experimental results on public datasets demonstrate that the proposed method achieves superior performance over state-of-the-art (SOTA) approaches in terms of Root Mean Squared Error (RMSE), Correct Match Number (CMN), Distribution of Matched Points (Scat), Edge Fidelity (EF), and overall visual quality.
ISSN:2072-4292