Graph-Based Hierarchical Semantic Consistency Network for Remote Sensing Image–Text Retrieval
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Online Access: | https://ieeexplore.ieee.org/document/11031116/ |
| Summary: | Remote sensing image-text retrieval (RSITR) is becoming increasingly essential for the efficient utilization of remote sensing (RS) data. Nevertheless, current approaches primarily focus on individual feature extraction strategies for visual and textual modalities. They often lack effective feature aggregation strategies to fully leverage intramodal information integration and inter-modal information interactions, resulting in imprecise cross-modal feature alignment. In this article, we propose a novel graph-based hierarchical semantic consistency network, which enhances intramodal semantic associations through graph node communication and comprehensively explores the alignment of remote sensing images and texts by the designed Uni-modal Graph Aggregation (UGA) module and the Cross-modal Graph Aggregation (CGA) module. The UGA module adaptively integrates information with different semantic significance in each feature graph for accurate measurement of integral cross-modal semantic consistency. Furthermore, cross-modal information interactions are facilitated by the CGA module, which constructs cross-modal relevance graphs to infer the fine-grained cross-modal similarity. Extensive experiments on the RSICD and RSITMD datasets validate the superior performance of our model in the RSITR task. |
|---|---|
| ISSN: | 1939-1404 2151-1535 |
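
The summary describes three ideas: graph node communication within each modality, adaptive aggregation of nodes by semantic significance, and a fine-grained cross-modal similarity. The record gives no architectural details, so the following is only a minimal sketch of those general ideas, not the authors' UGA or CGA modules. Every function name, the dot-product affinity, the mean-node significance score, and the max-over-regions matching rule are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def weighted_sum(weights, nodes):
    dim = len(nodes[0])
    return [sum(w * n[d] for w, n in zip(weights, nodes)) for d in range(dim)]

def message_pass(nodes):
    # One round of node communication on a fully connected graph.
    # Edge weights are a softmax over dot-product affinities (assumption).
    return [weighted_sum(softmax([dot(q, k) for k in nodes]), nodes) for q in nodes]

def aggregate(nodes):
    # Pool a graph into one vector, weighting each node by a hypothetical
    # significance score: its affinity to the mean node.
    mean = weighted_sum([1.0 / len(nodes)] * len(nodes), nodes)
    return weighted_sum(softmax([dot(n, mean) for n in nodes]), nodes)

def global_similarity(img_nodes, txt_nodes):
    # Integral (graph-level) consistency: cosine of the two pooled vectors.
    return cosine(aggregate(message_pass(img_nodes)),
                  aggregate(message_pass(txt_nodes)))

def fine_grained_similarity(img_nodes, txt_nodes):
    # Node-level consistency: each text node is matched to its most
    # similar image node, and the matches are averaged (assumption).
    best = [max(cosine(t, r) for r in img_nodes) for t in txt_nodes]
    return sum(best) / len(best)
```

In this sketch, image nodes would be region features and text nodes would be word features of equal dimension. A retrieval score could combine the two similarities, for example by summing them, though the weighting the authors actually use is not stated in this record.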