CLIP-Based Grid Features and Masking for Remote Sensing Image Captioning
Remote sensing image (RSI) captioning is a vision-language multimodal task that aims to describe image content in natural language, facilitating accurate and convenient comprehension of RSIs. Existing methods primarily focus on extracting visual features using vision-task pretraining models, such as...
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Online Access: | https://ieeexplore.ieee.org/document/10806569/ |