Training a high-performance retinal foundation model with half-the-data and 400 times less compute

Abstract: Medical artificial intelligence is limited by available training datasets. Foundation models like RETFound from Moorfields Eye Hospital (MEH) can be adapted with small downstream datasets and thus alleviate this issue. RETFound-MEH used 900,000 training images. Recently, "data-efficient" DERETFound achieved comparable performance with 150,000 images. Both require very substantial compute resources for training and use. We propose RETFound-Green, trained on only 75,000 publicly available images with 400 times less compute using a novel Token Reconstruction objective. RETFound-MEH and DERETFound training costs are estimated at $10,000 and $14,000, respectively; RETFound-Green cost less than $100, with equally reduced environmental impact. RETFound-Green can be downloaded 14 times faster and computes vector embeddings 2.7 times faster, which then require 2.6 times less storage space. On a variety of downstream tasks from geographically diverse datasets, RETFound-Green achieves more than twice as many statistically significant wins as the next best model.

Bibliographic Details
Main Authors: Justin Engelmann, Miguel O. Bernabeu
Format: Article
Language: English
Published: Nature Portfolio, 2025-07-01
Series: Nature Communications
Online Access:https://doi.org/10.1038/s41467-025-62123-z
ISSN: 2041-1723
Affiliation: Centre for Medical Informatics, Usher Institute, University of Edinburgh (both authors)