Medical Knowledge-Based Differential Image Visual Question Answering

Bibliographic Details
Main Authors:	Fangpeng Lu, Songyan Liu, Wenbin Lu, Peng Chen, Boyang Ding
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Visual question answering; medical knowledge graph; multimodal
Online Access:	https://ieeexplore.ieee.org/document/10980296/

Description
Summary:	Visual Question Answering (VQA) shows great promise for cross-disciplinary applications, and its integration into the medical field has become a major research focus in recent years. Current mainstream medical VQA models support only single-image input, whereas differential medical VQA accepts multiple images and can answer questions about them, including questions about the differences between images. However, these approaches focus primarily on extracting information from images and neglect the inherent information and relationships associated with the diseases themselves. This paper therefore proposes a differential medical VQA method based on medical knowledge, comprising three modules: the feature encoding module, the feature processing module, and the answer generation module. The method first learns cluster embeddings of medical knowledge features in the feature encoding module and then learns them interactively with image and text features. After differential operations, these features are fed into the feature attention module and then into the answer generation module, which produces the final answer. Experimental results demonstrate that the method significantly enhances performance on the differential medical VQA task. This advancement offers considerable reference value for improving the applicability and interpretability of medical visual question answering.
ISSN:	2169-3536

Similar Items