SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics

Synthetic Aperture Radar (SAR) is a crucial tool in remote sensing, yet existing deep learning methods are primarily limited in visual representation, neglecting the intrinsic characteristics of SAR and the need for strong generalization across multiple tasks. To address this, we propose SUMMIT (SAR...

Bibliographic Details
Main Authors: Yuntao Du, Yushi Chen, Lingbo Huang, Yahu Yang, Pedram Ghamisi, Qian Du
Format: Article
Language: English
Published: Elsevier 2025-07-01
Series: International Journal of Applied Earth Observations and Geoinformation
Subjects: Synthetic Aperture Radar; Foundation model; Self-supervised auxiliary task; Vision Transformer
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225002717
author Yuntao Du
Yushi Chen
Lingbo Huang
Yahu Yang
Pedram Ghamisi
Qian Du
collection DOAJ
description Synthetic Aperture Radar (SAR) is a crucial tool in remote sensing, yet existing deep learning methods are primarily limited in visual representation, neglecting the intrinsic characteristics of SAR and the need for strong generalization across multiple tasks. To address this, we propose SUMMIT (SAR foUndational Model with Multiple auxiliary tasks enhanced Intrinsic characterisTics), a foundational model tailored for SAR image understanding. SUMMIT is pre-trained on the Multi-sensor SAR Image Dataset (MuSID), which contains over 560,000 SAR images. To enhance its feature extraction capability, we introduce a masked image modeling (MIM) framework with self-supervised auxiliary tasks (SSATs): (1) MIM for learning robust structural representations, (2) self-supervised denoising to improve the model’s noise resistance, and (3) space scattering feature enhancement to preserve geometric consistency. Furthermore, we design an auxiliary task coordination module (ATCM) to balance these tasks and ensure effective feature fusion. The resulting self-supervised framework enables SUMMIT to integrate deep learning with SAR’s physical attributes effectively. Extensive experiments across seven datasets and three downstream tasks demonstrate that SUMMIT achieves state-of-the-art performance, particularly in SAR classification, detection, and segmentation. Code and pre-trained model of the proposed SUMMIT will be available at https://github.com/Yunsans/SUMMIT.
format Article
id doaj-art-65c6c4310ff94207a5dccaa1b021a527
institution DOAJ
issn 1569-8432
language English
publishDate 2025-07-01
publisher Elsevier
record_format Article
series International Journal of Applied Earth Observations and Geoinformation
spelling doaj-art-65c6c4310ff94207a5dccaa1b021a527
2025-08-20T03:22:11Z
eng
Elsevier
International Journal of Applied Earth Observations and Geoinformation
1569-8432
2025-07-01
Volume 141, Article 104624
DOI: 10.1016/j.jag.2025.104624
SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics
Yuntao Du: School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
Yushi Chen: School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China; Corresponding author.
Lingbo Huang: School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
Yahu Yang: Center for Control Theory and Guidance Technology Research, Harbin Institute of Technology, Harbin, 150001, China
Pedram Ghamisi: Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Helmholtz Institute Freiberg for Resource Technology, Freiberg, 09599, Germany
Qian Du: Department of Electrical and Computer Engineering, Mississippi State University, Starkville, MS 39762, USA
title SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics
topic Synthetic Aperture Radar
Foundation model
Self-supervised auxiliary task
Vision Transformer
url http://www.sciencedirect.com/science/article/pii/S1569843225002717