Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities

Introduction: The integration of recent technologies in medical imaging has become a cornerstone of modern healthcare, facilitating detailed analysis of internal anatomy and pathology. Traditional methods, however, often grapple with data-sharing restrictions due to privacy concerns. Emerging techniqu...

Bibliographic Details
Main Authors: Abdullah Hosseini, Ahmed Serag
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Artificial Intelligence
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2024.1454441/full
author Abdullah Hosseini
Ahmed Serag
collection DOAJ
description Introduction: The integration of recent technologies in medical imaging has become a cornerstone of modern healthcare, facilitating detailed analysis of internal anatomy and pathology. Traditional methods, however, often grapple with data-sharing restrictions due to privacy concerns. Emerging techniques in artificial intelligence offer innovative solutions to overcome these constraints, with synthetic data generation enabling the creation of realistic medical imaging datasets, but the preservation of critical hidden medical biomarkers is an open question. Methods: This study employs state-of-the-art Denoising Diffusion Probabilistic Models integrated with a Swin-transformer-based network to generate synthetic medical data. Three distinct areas of medical imaging - radiology, ophthalmology, and histopathology - are explored. The quality of synthetic images is evaluated through a classifier trained to identify the preservation of medical biomarkers. Results: The diffusion model effectively preserves key medical features, such as lung markings and retinal abnormalities, producing synthetic images closely resembling real data. Classifier performance demonstrates the reliability of synthetic data for downstream tasks, with F1 and AUC reaching 0.8–0.99. Discussion: This work provides valuable insights into the potential of diffusion-based models for generating realistic, biomarker-preserving synthetic images across various medical imaging modalities. These findings highlight the potential of synthetic data to address challenges such as data scarcity and privacy concerns in clinical practice, research, and education.
format Article
id doaj-art-c2addfde69ab40c4a7b46d0156fdfe84
institution Kabale University
issn 2624-8212
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj-art-c2addfde69ab40c4a7b46d0156fdfe84; 2025-01-31T11:48:01Z; eng; Frontiers Media S.A.; Frontiers in Artificial Intelligence; 2624-8212; 2025-01-01; 7; 10.3389/frai.2024.1454441; 1454441; Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities; Abdullah Hosseini; Ahmed Serag; https://www.frontiersin.org/articles/10.3389/frai.2024.1454441/full; synthetic data generation; clinical biomarkers; denoising diffusion models; medical imaging; Swin-transformer network
title Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities
topic synthetic data generation
clinical biomarkers
denoising diffusion models
medical imaging
Swin-transformer network
url https://www.frontiersin.org/articles/10.3389/frai.2024.1454441/full