Medical Knowledge-Based Differential Image Visual Question Answering

Bibliographic Details
Main Authors:	Fangpeng Lu, Songyan Liu, Wenbin Lu, Peng Chen, Boyang Ding
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Visual question answering; medical knowledge graph; multimodal
Online Access:	https://ieeexplore.ieee.org/document/10980296/

_version_	1850131780694179840
author	Fangpeng Lu, Songyan Liu, Wenbin Lu, Peng Chen, Boyang Ding
collection	DOAJ
description	Visual question answering (VQA) shows great promise for cross-disciplinary applications, and its integration into the medical field has become a major research focus in recent years. Current mainstream medical VQA models accept only a single image as input, whereas differential medical VQA accepts multiple images and can answer questions about them, including questions about the differences between images. However, these approaches focus primarily on extracting information from the images and neglect the inherent information about the diseases themselves and the relationships associated with them. This paper therefore proposes a differential medical VQA method based on medical knowledge, comprising three modules: a feature encoding module, a feature processing module, and an answer generation module. The feature encoding module first learns cluster embeddings of medical knowledge features, which are then learned interactively with the image and text features. After the differential operations, the features are fed into the feature attention module and then into the answer generation module, which produces the final answer. Experimental results show that the method significantly improves performance on the differential medical VQA task. The work offers a useful reference for improving the applicability and interpretability of medical VQA.
format	Article
id	doaj-art-7331a009741d437384979edbf8d03a0e
institution	OA Journals
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-7331a009741d437384979edbf8d03a0e; 2025-08-20T02:32:22Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 93818-93829; DOI 10.1109/ACCESS.2025.3565695; document 10980296; Medical Knowledge-Based Differential Image Visual Question Answering; Fangpeng Lu (https://orcid.org/0009-0006-3449-7424), Songyan Liu (https://orcid.org/0009-0009-4602-6031), Wenbin Lu, Peng Chen, Boyang Ding; affiliation of all five authors: School of Electronic Engineering, Heilongjiang University, Harbin, Heilongjiang, China; https://ieeexplore.ieee.org/document/10980296/; Visual question answering; medical knowledge graph; multimodal
title	Medical Knowledge-Based Differential Image Visual Question Answering
topic	Visual question answering; medical knowledge graph; multimodal
url	https://ieeexplore.ieee.org/document/10980296/

Similar Items