SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics
Synthetic Aperture Radar (SAR) is a crucial tool in remote sensing, yet existing deep learning methods are primarily limited in visual representation, neglecting the intrinsic characteristics of SAR and the need for strong generalization across multiple tasks. To address this, we propose SUMMIT (SAR...
| Main Authors: | Yuntao Du, Yushi Chen, Lingbo Huang, Yahu Yang, Pedram Ghamisi, Qian Du |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-07-01 |
| Series: | International Journal of Applied Earth Observations and Geoinformation |
| Subjects: | Synthetic Aperture Radar; Foundation model; Self-supervised auxiliary task; Vision Transformer |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843225002717 |
| _version_ | 1849687957345140736 |
|---|---|
| author | Yuntao Du; Yushi Chen; Lingbo Huang; Yahu Yang; Pedram Ghamisi; Qian Du |
| collection | DOAJ |
| description | Synthetic Aperture Radar (SAR) is a crucial tool in remote sensing, yet existing deep learning methods are primarily limited in visual representation, neglecting the intrinsic characteristics of SAR and the need for strong generalization across multiple tasks. To address this, we propose SUMMIT (SAR foUndational Model with Multiple auxiliary tasks enhanced Intrinsic characterisTics), a foundational model tailored for SAR image understanding. SUMMIT is pre-trained on the Multi-sensor SAR Image Dataset (MuSID), which contains over 560,000 SAR images. To enhance its feature extraction capability, we introduce a masked image modeling (MIM) framework with self-supervised auxiliary tasks (SSATs): (1) MIM for learning robust structural representations, (2) self-supervised denoising to improve the model’s noise resistance, and (3) space scattering feature enhancement to preserve geometric consistency. Furthermore, we design an auxiliary task coordination module (ATCM) to balance these tasks and ensure effective feature fusion. The resulting self-supervised framework enables SUMMIT to integrate deep learning with SAR’s physical attributes effectively. Extensive experiments across seven datasets and three downstream tasks demonstrate that SUMMIT achieves state-of-the-art performance, particularly in SAR classification, detection, and segmentation. Code and pre-trained model of the proposed SUMMIT will be available at https://github.com/Yunsans/SUMMIT. |
| format | Article |
| id | doaj-art-65c6c4310ff94207a5dccaa1b021a527 |
| institution | DOAJ |
| issn | 1569-8432 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Elsevier |
| record_format | Article |
| series | International Journal of Applied Earth Observations and Geoinformation |
| spelling | Record ID: doaj-art-65c6c4310ff94207a5dccaa1b021a527; record timestamp: 2025-08-20T03:22:11Z; language: eng; publisher: Elsevier; journal: International Journal of Applied Earth Observations and Geoinformation; ISSN: 1569-8432; published: 2025-07-01; volume: 141; article number: 104624; DOI: 10.1016/j.jag.2025.104624; title: SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics; authors and affiliations: Yuntao Du (School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China), Yushi Chen (School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China; corresponding author), Lingbo Huang (School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China), Yahu Yang (Center for Control Theory and Guidance Technology Research, Harbin Institute of Technology, Harbin, 150001, China), Pedram Ghamisi (Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Helmholtz Institute Freiberg for Resource Technology, Freiberg, 09599, Germany), Qian Du (Department of Electrical and Computer Engineering, Mississippi State University, Starkville, MS 39762, USA); URL: http://www.sciencedirect.com/science/article/pii/S1569843225002717; keywords: Synthetic Aperture Radar, Foundation model, Self-supervised auxiliary task, Vision Transformer |
| title | SUMMIT: A SAR foundation model with multiple auxiliary tasks enhanced intrinsic characteristics |
| topic | Synthetic Aperture Radar; Foundation model; Self-supervised auxiliary task; Vision Transformer |
| url | http://www.sciencedirect.com/science/article/pii/S1569843225002717 |