Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification
In remote sensing applications, autonomous aerial vehicles (AAVs) overcome the limitations of single-sensor approaches by integrating multiple sensors and fusing cross-modal data, significantly improving target classification accuracy. However, during the process of multimodal learning, the effectiv...
| Main Authors: | Shihao Wang, Zhengwei Xu, Yun Lin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | Gradient conflict; modal fusion; modality imbalance; multimodal learning |
| Online Access: | https://ieeexplore.ieee.org/document/11071999/ |
| _version_ | 1849706363776663552 |
|---|---|
| author | Shihao Wang; Zhengwei Xu; Yun Lin |
| author_facet | Shihao Wang; Zhengwei Xu; Yun Lin |
| author_sort | Shihao Wang |
| collection | DOAJ |
| description | In remote sensing applications, autonomous aerial vehicles (AAVs) overcome the limitations of single-sensor approaches by integrating multiple sensors and fusing cross-modal data, significantly improving target classification accuracy. However, during the process of multimodal learning, the effectiveness of fusion is severely affected by modality imbalance caused by inconsistent gradient directions when integrating heterogeneous information. Existing methods predominantly focus on parameter tuning and gradient modulation, failing to resolve inherent conflicts from divergent modality optimization trajectories. To address these limitations, we propose a gradient-criterion multistage training (GCMT) framework, which systematically resolves gradient conflicts through an alternating freezing strategy, optimizing unimodal branches by evaluating consistency between unimodal and multimodal gradient directions. Building on the GCMT, we further introduce an information entropy measurement fusion (IEMF) module, which dynamically adjusts cross-modal feature fusion weights using entropy-based metrics to mitigate overreliance on dominant modalities while preserving synergistic interactions. We build a multimodal dataset of signals and images based on the UAV platform, and extensive experiments are implemented on both our self-constructed and public datasets. The results not only demonstrate a significant improvement in the performance of our GCMT compared to state-of-the-art methods, but also validate the efficacy of GCMT in harmonizing gradient alignment and of IEMF in enabling balanced multimodal fusion. |
| format | Article |
| id | doaj-art-4978b44bfdbd4fcaa455901aa8bb44b7 |
| institution | DOAJ |
| issn | 1939-1404; 2151-1535 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| spelling | doaj-art-4978b44bfdbd4fcaa455901aa8bb44b7; 2025-08-20T03:16:12Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN 1939-1404, 2151-1535; 2025-01-01; vol. 18, pp. 18240-18250; DOI 10.1109/JSTARS.2025.3586132; document 11071999; Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification; Shihao Wang (https://orcid.org/0009-0000-4515-0959), Key Laboratory of Advanced Ship Communication and Information Technology, Ministry of Industry and Information Technology, College of Information and Communication Engineering, Harbin Engineering University, Harbin, China; Zhengwei Xu (https://orcid.org/0000-0002-2645-9915), Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, Xinxiang, China; Yun Lin (https://orcid.org/0000-0002-4002-1282), Key Laboratory of Advanced Ship Communication and Information Technology, Ministry of Industry and Information Technology, College of Information and Communication Engineering, Harbin Engineering University, Harbin, China; In remote sensing applications, autonomous aerial vehicles (AAVs) overcome the limitations of single-sensor approaches by integrating multiple sensors and fusing cross-modal data, significantly improving target classification accuracy. However, during the process of multimodal learning, the effectiveness of fusion is severely affected by modality imbalance caused by inconsistent gradient directions when integrating heterogeneous information. Existing methods predominantly focus on parameter tuning and gradient modulation, failing to resolve inherent conflicts from divergent modality optimization trajectories. To address these limitations, we propose a gradient-criterion multistage training (GCMT) framework, which systematically resolves gradient conflicts through an alternating freezing strategy, optimizing unimodal branches by evaluating consistency between unimodal and multimodal gradient directions. Building on the GCMT, we further introduce an information entropy measurement fusion (IEMF) module, which dynamically adjusts cross-modal feature fusion weights using entropy-based metrics to mitigate overreliance on dominant modalities while preserving synergistic interactions. We build a multimodal dataset of signals and images based on the UAV platform, and extensive experiments are implemented on both our self-constructed and public datasets. The results not only demonstrate a significant improvement in the performance of our GCMT compared to state-of-the-art methods, but also validate the efficacy of GCMT in harmonizing gradient alignment and of IEMF in enabling balanced multimodal fusion.; https://ieeexplore.ieee.org/document/11071999/; Gradient conflict, modal fusion, modality imbalance, multimodal learning |
| spellingShingle | Shihao Wang; Zhengwei Xu; Yun Lin; Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Gradient conflict; modal fusion; modality imbalance; multimodal learning |
| title | Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification |
| title_full | Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification |
| title_fullStr | Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification |
| title_full_unstemmed | Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification |
| title_short | Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification |
| title_sort | multistage training and fusion method for imbalanced multimodal uav remote sensing classification |
| topic | Gradient conflict; modal fusion; modality imbalance; multimodal learning |
| url | https://ieeexplore.ieee.org/document/11071999/ |
| work_keys_str_mv | AT shihaowang multistagetrainingandfusionmethodforimbalancedmultimodaluavremotesensingclassification AT zhengweixu multistagetrainingandfusionmethodforimbalancedmultimodaluavremotesensingclassification AT yunlin multistagetrainingandfusionmethodforimbalancedmultimodaluavremotesensingclassification |
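
The description above names two ideas without giving formulas: checking whether unimodal and multimodal gradient directions agree (GCMT), and setting fusion weights from entropy (IEMF). The sketch below is only an illustration of those two ideas, not the authors' implementation. It assumes cosine similarity as the consistency measure, a threshold of 0 for "conflict", and weights proportional to exp(-entropy); the record states none of these, and the function names `conflicting_branches` and `entropy_fusion_weights` are hypothetical.

```python
import math


def cosine_similarity(g_a, g_b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero gradient is treated as neither aligned nor conflicting
    return dot / (norm_a * norm_b)


def conflicting_branches(unimodal_grads, multimodal_grads, threshold=0.0):
    """Names of branches whose unimodal gradient disagrees with the multimodal one.

    Assumption: "disagrees" means cosine similarity below `threshold`.
    What a training loop does with these branches (for example, which
    ones to freeze in the next stage) is left open here.
    """
    return [
        name
        for name, grad in unimodal_grads.items()
        if cosine_similarity(grad, multimodal_grads[name]) < threshold
    ]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_fusion_weights(predictions):
    """Per-modality fusion weights that sum to 1.

    Assumption: weight is proportional to exp(-entropy), so a modality
    with a more confident (lower-entropy) prediction gets more weight.
    The paper's actual weighting rule may differ.
    """
    scores = {name: math.exp(-entropy(p)) for name, p in predictions.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


if __name__ == "__main__":
    uni = {"image": [1.0, 0.0], "signal": [0.0, 1.0]}
    multi = {"image": [1.0, 1.0], "signal": [1.0, -1.0]}
    print(conflicting_branches(uni, multi))

    preds = {"image": [0.9, 0.1], "signal": [0.5, 0.5]}
    print(entropy_fusion_weights(preds))
```

In the toy inputs, the signal branch's unimodal gradient points against its multimodal gradient, so it is the one flagged; the image modality has the lower-entropy prediction and so receives the larger fusion weight under the assumed rule.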