Generating Artificial Patients With Reliable Clinical Characteristics Using a Geometry-Based Variational Autoencoder: Proof-of-Concept Feasibility Study

Background: Artificial patient technology could transform health care by accelerating diagnosis, treatment, and mapping clinical pathways. Deep learning methods for generating artificial data in health care include data augmentation by variational autoencoder (VAE) technology....

Bibliographic Details
Main Authors:	Fabrice Ferré, Stéphanie Allassonnière, Clément Chadebec, Vincent Minville
Format:	Article
Language:	English
Published:	JMIR Publications 2025-04-01
Series:	Journal of Medical Internet Research
Online Access:	https://www.jmir.org/2025/1/e63130

collection	DOAJ
description	Background: Artificial patient technology could transform health care by accelerating diagnosis, treatment, and mapping clinical pathways. Deep learning methods for generating artificial data in health care include data augmentation by variational autoencoder (VAE) technology. Objective: We aimed to test the feasibility of generating artificial patients with reliable clinical characteristics by using a geometry-based VAE applied, for the first time, to high-dimension, low-sample-size tabular data. Methods: Clinical tabular data were extracted from 521 real patients of the “MAX” digital conversational agent (BOTdesign) created for preparing patients for anesthesia. A 3-stage methodological approach was implemented to generate up to 10,000 artificial patients: (1) training the model and generating artificial data, (2) assessing the consistency and confidentiality of the artificial data, and (3) validating the plausibility of the newly created artificial patients. Results: We demonstrated the feasibility of applying the VAE technique to tabular data to generate large artificial patient cohorts with high consistency (fidelity scores >94%). Moreover, artificial patients could not be matched with real patients (filter similarity scores >99%, κ coefficients of agreement <0.2), thus addressing the essential ethical concern of confidentiality. Conclusions: This proof-of-concept study has demonstrated our ability to augment real tabular data to generate artificial patients. These promising results make it possible to envisage in silico trials carried out on large cohorts of artificial patients, thereby overcoming the pitfalls usually encountered in in vivo trials. Further studies integrating longitudinal dynamics are needed to map patient trajectories.
id	doaj-art-4ff1a2ab47304516902586fa41b5ed5d
institution	OA Journals
issn	1438-8871
spelling	doaj-art-4ff1a2ab47304516902586fa41b5ed5d | 2025-08-20T02:27:39Z | eng | JMIR Publications | Journal of Medical Internet Research | 1438-8871 | 2025-04-01 | 27 | e63130 | 10.2196/63130 | Generating Artificial Patients With Reliable Clinical Characteristics Using a Geometry-Based Variational Autoencoder: Proof-of-Concept Feasibility Study | Fabrice Ferré (https://orcid.org/0000-0001-6648-7454) | Stéphanie Allassonnière (https://orcid.org/0000-0002-5692-4945) | Clément Chadebec (https://orcid.org/0000-0003-3890-1392) | Vincent Minville (https://orcid.org/0000-0003-0516-4939) | https://www.jmir.org/2025/1/e63130


Similar Items