Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data

Abstract In this study, as a proof-of-concept, we aim to initiate the development of a Radiology Foundation Model, termed RadFM. We consider three perspectives: dataset construction, model design, and thorough evaluation, summarized as follows: (i) we contribute 4 multimodal datasets with 13M 2D im...

Bibliographic Details
Main Authors:	Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Hui Hui, Yanfeng Wang, Weidi Xie
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-08-01
Series:	Nature Communications
Online Access:	https://doi.org/10.1038/s41467-025-62385-7

author	Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Hui Hui, Yanfeng Wang, Weidi Xie
collection	DOAJ
description	Abstract In this study, as a proof-of-concept, we aim to initiate the development of a Radiology Foundation Model, termed RadFM. We consider three perspectives: dataset construction, model design, and thorough evaluation, summarized as follows: (i) we contribute 4 multimodal datasets with 13M 2D images and 615K 3D scans. Combined with a vast collection of existing datasets, these form our training dataset, termed the Medical Multi-modal Dataset (MedMD); (ii) we propose an architecture that integrates text input with 2D or 3D medical scans and generates responses for diverse radiologic tasks, including diagnosis, visual question answering, report generation, and rationale diagnosis; (iii) beyond evaluation on 9 existing datasets, we propose a new benchmark, RadBench, comprising three tasks that aim to assess foundation models comprehensively. We conduct both automatic and human evaluations on RadBench. RadFM outperforms previously accessible multi-modal foundation models, including GPT-4V. Additionally, we adapt RadFM to diverse public benchmarks, surpassing various existing SOTAs.
format	Article
id	doaj-art-257cc54748da4cf88f97428bd98bd46b
institution	Kabale University
issn	2041-1723
language	English
publishDate	2025-08-01
publisher	Nature Portfolio
record_format	Article
series	Nature Communications
spelling	doaj-art-257cc54748da4cf88f97428bd98bd46b; 2025-08-24T11:38:14Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-08-01; vol. 16, no. 1, pp. 1-22; 10.1038/s41467-025-62385-7; Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data; Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Hui Hui, Yanfeng Wang, Weidi Xie (all Shanghai Jiao Tong University); https://doi.org/10.1038/s41467-025-62385-7
title	Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data
url	https://doi.org/10.1038/s41467-025-62385-7

Similar Items