Adapting Semi-Supervised Segmentation methods to Multimodal Remote Sensing Data
Remote sensing (RS) imagery is important for applications ranging from land cover and land use (LCLU) mapping to agriculture and forest monitoring. However, there is a limited availability of high-quality labeled data to use as a reference to train supervised learning (SL) models. Semi-supervised le...
| Main Authors: | I. Hernandez-Sequeira, D. Ibanez, R. Fernandez-Beltran, F. Pla |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Copernicus Publications, 2025-05-01 |
| Series: | The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences |
| Online Access: | https://isprs-archives.copernicus.org/articles/XLVIII-M-7-2025/21/2025/isprs-archives-XLVIII-M-7-2025-21-2025.pdf |
| _version_ | 1850270932233355264 |
|---|---|
| author | I. Hernandez-Sequeira; D. Ibanez; R. Fernandez-Beltran; F. Pla |
| author_sort | I. Hernandez-Sequeira |
| collection | DOAJ |
| description | Remote sensing (RS) imagery is important for applications ranging from land cover and land use (LCLU) mapping to agriculture and forest monitoring. However, there is a limited availability of high-quality labeled data to use as a reference to train supervised learning (SL) models. Semi-supervised learning (SSL) frameworks, such as UniMatch (Yang et al., 2023), use pseudo-labeling and consistency regularization methods to address this limitation. Similar works have been adapted to RS: LSST (Lu et al., 2022) refines pseudo-labels with adaptive class-specific thresholds, while RS-DWL (Huang et al., 2024) mitigates noise and class imbalance through decoupled learning and confidence-based weighting. Despite these advances, SSL applications to multimodal RS imagery remain underexplored. We address this gap by adapting the SSL framework UniMatch to incorporate diverse encoders and multimodal remote sensing data for LCLU segmentation. We experimented on FLAIR-2 (Garioud et al., 2023), a dataset that combines very high-resolution aerial imagery (RGB) with near-infrared (NIR) data and elevation measurements (above-ground height). Key findings reveal that we achieved the best segmentation results using a transformer encoder for SL and SSL scenarios. When comparing RGB-only data and multimodal data, we observed that some classes, like “buildings”, “water”, and “coniferous”, benefited from the inclusion of NIR and elevation information. In the semi-supervised experiments, where only half of the data was labeled, and the remaining half was used as unlabeled (simulating a real-world scenario), the multimodal SSL approach outperformed the fully supervised learning (FSL) approach using only the labeled subset (1/2). These results highlight the strong potential of data fusion in RS applications with limited labeled data. |
| format | Article |
| id | doaj-art-9de9e8d415654452b4a4d4ac9d735043 |
| institution | OA Journals |
| issn | 1682-1750, 2194-9034 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Copernicus Publications |
| record_format | Article |
| series | The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences |
| spelling | doaj-art-9de9e8d415654452b4a4d4ac9d735043; 2025-08-20T01:52:23Z; eng; Copernicus Publications; The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences; ISSN 1682-1750, 2194-9034; 2025-05-01; vol. XLVIII-M-7-2025, pp. 21-28; DOI 10.5194/isprs-archives-XLVIII-M-7-2025-21-2025; Adapting Semi-Supervised Segmentation methods to Multimodal Remote Sensing Data; I. Hernandez-Sequeira (Institute of New Imaging Technologies, University Jaume I, 12071 Castellón de la Plana, Spain); D. Ibanez (Institute of New Imaging Technologies, University Jaume I, 12071 Castellón de la Plana, Spain); R. Fernandez-Beltran (Dept. of Computer Science and Systems, University of Murcia, 30100 Murcia, Spain); F. Pla (Institute of New Imaging Technologies, University Jaume I, 12071 Castellón de la Plana, Spain); https://isprs-archives.copernicus.org/articles/XLVIII-M-7-2025/21/2025/isprs-archives-XLVIII-M-7-2025-21-2025.pdf |
| title | Adapting Semi-Supervised Segmentation methods to Multimodal Remote Sensing Data |
| url | https://isprs-archives.copernicus.org/articles/XLVIII-M-7-2025/21/2025/isprs-archives-XLVIII-M-7-2025-21-2025.pdf |