Medical Knowledge-Based Differential Image Visual Question Answering
Visual Question Answering (VQA) technology shows great promise for cross-disciplinary applications, with its integration into the medical field emerging as a major research focus in recent years. The current mainstream medical visual question answering (VQA) models only support single-image input, w...
| Main Authors: | Fangpeng Lu, Songyan Liu, Wenbin Lu, Peng Chen, Boyang Ding |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Visual question answering; medical knowledge graph; multimodal |
| Online Access: | https://ieeexplore.ieee.org/document/10980296/ |
| _version_ | 1850131780694179840 |
|---|---|
| author | Fangpeng Lu; Songyan Liu; Wenbin Lu; Peng Chen; Boyang Ding |
| author_facet | Fangpeng Lu; Songyan Liu; Wenbin Lu; Peng Chen; Boyang Ding |
| author_sort | Fangpeng Lu |
| collection | DOAJ |
| description | Visual Question Answering (VQA) technology shows great promise for cross-disciplinary applications, with its integration into the medical field emerging as a major research focus in recent years. The current mainstream medical visual question answering (VQA) models only support single-image input, whereas differential medical VQA can support multiple images and answer questions, including those about differences between images. However, these approaches primarily focus on extracting information from images while neglecting the inherent information and relevant relationships associated with the diseases themselves. Therefore, this paper proposes a differential medical visual question answering method based on medical knowledge, comprising three modules: the feature encoding module, the feature processing module, and the answer generation module. The proposed method first learns cluster embeddings of medical knowledge features through the feature encoding module, which are then interactively learned with image and text features. Following differential operations, these features are fed into the feature attention module, and subsequently into the answer generation module to produce the final answer. Experimental results demonstrate that our method significantly enhances the performance of the differential medical visual question answering task. This advancement is of considerable reference value in improving the applicability and interpretability of medical visual question answering. |
| format | Article |
| id | doaj-art-7331a009741d437384979edbf8d03a0e |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-7331a009741d437384979edbf8d03a0e; 2025-08-20T02:32:22Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 93818; 93829; 10.1109/ACCESS.2025.3565695; 10980296; Medical Knowledge-Based Differential Image Visual Question Answering; Fangpeng Lu (0, https://orcid.org/0009-0006-3449-7424); Songyan Liu (1, https://orcid.org/0009-0009-4602-6031); Wenbin Lu (2); Peng Chen (3); Boyang Ding (4); School of Electronic Engineering, Heilongjiang University, Harbin, Heilongjiang, China (listed once per author); Visual Question Answering (VQA) technology shows great promise for cross-disciplinary applications, with its integration into the medical field emerging as a major research focus in recent years. The current mainstream medical visual question answering (VQA) models only support single-image input, whereas differential medical VQA can support multiple images and answer questions, including those about differences between images. However, these approaches primarily focus on extracting information from images while neglecting the inherent information and relevant relationships associated with the diseases themselves. Therefore, this paper proposes a differential medical visual question answering method based on medical knowledge, comprising three modules: the feature encoding module, the feature processing module, and the answer generation module. The proposed method first learns cluster embeddings of medical knowledge features through the feature encoding module, which are then interactively learned with image and text features. Following differential operations, these features are fed into the feature attention module, and subsequently into the answer generation module to produce the final answer. Experimental results demonstrate that our method significantly enhances the performance of the differential medical visual question answering task. This advancement is of considerable reference value in improving the applicability and interpretability of medical visual question answering.; https://ieeexplore.ieee.org/document/10980296/; Visual question answering; medical knowledge graph; multimodal |
| spellingShingle | Fangpeng Lu; Songyan Liu; Wenbin Lu; Peng Chen; Boyang Ding; Medical Knowledge-Based Differential Image Visual Question Answering; IEEE Access; Visual question answering; medical knowledge graph; multimodal |
| title | Medical Knowledge-Based Differential Image Visual Question Answering |
| title_full | Medical Knowledge-Based Differential Image Visual Question Answering |
| title_fullStr | Medical Knowledge-Based Differential Image Visual Question Answering |
| title_full_unstemmed | Medical Knowledge-Based Differential Image Visual Question Answering |
| title_short | Medical Knowledge-Based Differential Image Visual Question Answering |
| title_sort | medical knowledge based differential image visual question answering |
| topic | Visual question answering; medical knowledge graph; multimodal |
| url | https://ieeexplore.ieee.org/document/10980296/ |
| work_keys_str_mv | AT fangpenglu medicalknowledgebaseddifferentialimagevisualquestionanswering AT songyanliu medicalknowledgebaseddifferentialimagevisualquestionanswering AT wenbinlu medicalknowledgebaseddifferentialimagevisualquestionanswering AT pengchen medicalknowledgebaseddifferentialimagevisualquestionanswering AT boyangding medicalknowledgebaseddifferentialimagevisualquestionanswering |
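
The description field outlines a pipeline in which knowledge features are combined with image and text features, a differential operation is applied, and an answer is generated. The record gives no implementation details, so the following is only a toy sketch of that general idea and not the authors' method. All function names (`attend`, `differential_features`, `answer`), the additive fusion, and the dot-product attention are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys):
    """Weighted sum of `keys`, weighted by softmax similarity to `query`."""
    weights = softmax([dot(query, k) for k in keys])
    return [sum(w * k[i] for w, k in zip(weights, keys))
            for i in range(len(query))]


def differential_features(image_a, image_b, knowledge):
    """Assumed fusion: add knowledge context to each image vector, then subtract."""
    enriched_a = [x + c for x, c in zip(image_a, attend(image_a, knowledge))]
    enriched_b = [x + c for x, c in zip(image_b, attend(image_b, knowledge))]
    return [b - a for a, b in zip(enriched_a, enriched_b)]


def answer(question, diff, answer_embeddings):
    """Pick the candidate whose embedding best matches question + difference."""
    fused = [q + d for q, d in zip(question, diff)]
    scores = {label: dot(fused, emb) for label, emb in answer_embeddings.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    knowledge = [[0.5, 0.5]]                      # hypothetical knowledge embedding
    diff = differential_features([1.0, 0.0], [0.0, 1.0], knowledge)
    candidates = {"changed": [-1.0, 1.0], "unchanged": [0.0, 0.0]}
    print(answer([0.0, 0.0], diff, candidates))
```

In this sketch two identical image vectors yield an all-zero difference vector, which is the property a differential operation is meant to capture; a real system would replace the hand-written vectors with learned encoders.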