MSAM: Video Question Answering Based on Multi-Stage Attention Model
The video question answering (VideoQA) task requires understanding the semantic information of both the video and the question in order to generate an answer. At present, it is difficult for attention-based VideoQA methods to fully understand and accurately locate video information related to th...
| Main Authors: | LIANG Li-li, LIU Xin-yu, SUN Guang-lu, ZHU Su-xia |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Harbin University of Science and Technology Publications, 2022-08-01 |
| Series: | Journal of Harbin University of Science and Technology |
| Online Access: | https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=2123 |
Similar Items
- Deep Memory Fusion Model for Long Video Question Answering
  by: SUN Guanglu, et al.
  Published: (2021-02-01)
- A Semantic Weight Adaptive Model Based on Visual Question Answering
  by: Li Huimin, et al.
  Published: (2025-01-01)
- Multimodal representative answer extraction in community question answering
  by: Ming Li, et al.
  Published: (2023-10-01)
- Enhancing Visual Question Answering for Multiple Choice Questions
  by: Rashi Goel, et al.
  Published: (2025-01-01)
- An Image Grid Can Be Worth a Video: Zero-Shot Video Question Answering Using a VLM
  by: Wonkyun Kim, et al.
  Published: (2024-01-01)