A multimodal multidomain multilingual medical foundation model for zero shot clinical diagnosis

Bibliographic Details
Main Authors: Fenglin Liu, Zheng Li, Qingyu Yin, Jinfa Huang, Jiebo Luo, Anshul Thakur, Kim Branson, Patrick Schwab, Bing Yin, Xian Wu, Yefeng Zheng, David A. Clifton
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-024-01339-7
Description
Summary: Radiology images are among the most commonly used data in daily clinical diagnosis. Typically, clinical diagnosis using radiology images involves disease reporting and classification. The former is a multimodal task in which textual reports are generated to describe clinical findings in images, as is common in various domains, e.g., chest X-ray or computed tomography. Existing approaches are mainly supervised, so their quality depends heavily on the volume and quality of available labeled data. However, for rarer or more novel diseases, enrolling patients to collect data is both time-consuming and expensive, and for non-English languages, sufficient quantities of labeled data are typically not available. We propose the Multimodal Multidomain Multilingual Foundation Model. It is useful for rare diseases and non-English languages, where labeled data are frequently much scarcer and may even be absent. Our approach achieves encouraging performance on nine datasets, including 2 infectious and 14 non-infectious diseases.
ISSN: 2398-6352