Discriminative, generative artificial intelligence, and foundation models in retina imaging
Recent advances in artificial intelligence (AI) for retinal imaging fall into two major categories: discriminative and generative AI. For discriminative tasks, conventional convolutional neural networks (CNNs) remain the major technique. Vision transformers (ViT), inspired by the tr...
| Main Authors: | Paisan Ruamviboonsuk, Niracha Arjkongharn, Nattaporn Vongsa, Pawin Pakaymaskul, Natsuda Kaothanthong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wolters Kluwer Medknow Publications, 2024-12-01 |
| Series: | Taiwan Journal of Ophthalmology |
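| Subjects: | discriminative artificial intelligence; foundation models; generative artificial intelligence; retinal imaging; vision transformer |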
| Online Access: | https://journals.lww.com/10.4103/tjo.TJO-D-24-00064 |
| _version_ | 1850095095843389440 |
|---|---|
| author | Paisan Ruamviboonsuk; Niracha Arjkongharn; Nattaporn Vongsa; Pawin Pakaymaskul; Natsuda Kaothanthong |
| author_sort | Paisan Ruamviboonsuk |
| collection | DOAJ |
| description | Recent advances in artificial intelligence (AI) for retinal imaging fall into two major categories: discriminative and generative AI. For discriminative tasks, conventional convolutional neural networks (CNNs) remain the major technique. Vision transformers (ViT), inspired by the transformer architecture in natural language processing, have emerged as a useful technique for discriminating retinal images. When pretrained at sufficient scale, ViT can attain excellent results after transfer to specific tasks, using fewer images than conventional CNNs. Many studies have found that ViT outperforms CNNs on common tasks such as diabetic retinopathy screening on color fundus photographs (CFP) and segmentation of retinal fluid on optical coherence tomography (OCT) images. The generative adversarial network (GAN) is the main technique for generative AI in retinal imaging. Novel images generated by a GAN can be used to train AI models when datasets are imbalanced or inadequate. Foundation models are another recent advance in retinal imaging. They are pretrained on huge datasets, such as millions of CFP and OCT images, and then fine-tuned for downstream tasks with much smaller datasets. One foundation model, RETFound, was trained with self-supervision and was found to discriminate many eye and systemic diseases better than supervised models. Large language models are foundation models that may be applied to text-related tasks, such as reports of retinal angiography. While AI technology advances quickly, real-world use of AI models advances slowly, widening the gap between development and deployment. Strong evidence that AI models can prevent visual loss may be required to close this gap. |
| format | Article |
| id | doaj-art-76cee076f2234724944eecb8914e32a4 |
| institution | DOAJ |
| issn | 2211-5056; 2211-5072 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Wolters Kluwer Medknow Publications |
| record_format | Article |
| series | Taiwan Journal of Ophthalmology |
| spelling | doaj-art-76cee076f2234724944eecb8914e32a4; 2025-08-20T02:41:31Z; eng; Wolters Kluwer Medknow Publications; Taiwan Journal of Ophthalmology; 2211-5056; 2211-5072; 2024-12-01; 14; 4; 473; 485; 10.4103/tjo.TJO-D-24-00064; Discriminative, generative artificial intelligence, and foundation models in retina imaging; Paisan Ruamviboonsuk; Niracha Arjkongharn; Nattaporn Vongsa; Pawin Pakaymaskul; Natsuda Kaothanthong; https://journals.lww.com/10.4103/tjo.TJO-D-24-00064; discriminative artificial intelligence; foundation models; generative artificial intelligence; retinal imaging; vision transformer |
| title | Discriminative, generative artificial intelligence, and foundation models in retina imaging |
| topic | discriminative artificial intelligence; foundation models; generative artificial intelligence; retinal imaging; vision transformer |
| url | https://journals.lww.com/10.4103/tjo.TJO-D-24-00064 |