Improving Visual Question Answering by Image Captioning
Visual Question Answering (VQA) is a challenging task that bridges the computer vision and natural language processing communities. It provides natural language answers to questions about an associated image. Most existing VQA methods focus on the fusion and inference of visual features with the...
| Main Authors: | Xiangjun Shao, Hongsong Dong, Guangsheng Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10918635/ |
Similar Items
- Medical Knowledge-Based Differential Image Visual Question Answering
  by: Fangpeng Lu, et al.
  Published: (2025-01-01)
- Adaptive Conditional Reasoning for Remote Sensing Visual Question Answering
  by: Yiqun Gao, et al.
  Published: (2025-04-01)
- Visual Question Answering in Robotic Surgery: A Comprehensive Review
  by: Di Ding, et al.
  Published: (2025-01-01)
- Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion
  by: Junkai Zhang, et al.
  Published: (2025-04-01)
- Designing and Evaluating a Dual-Stream Transformer-Based Architecture for Visual Question Answering
  by: Faheem Shehzad, et al.
  Published: (2024-01-01)