Decades in the Making: The Evolution of Digital Health Research Infrastructure Through Synthetic Data, Common Data Models, and Federated Learning

Traditionally, medical research is based on randomized controlled trials (RCTs) for interventions such as drugs and operative procedures. However, increasingly, there is a need for health research to evolve. RCTs are expensive to run, are generally formulated with a single research questi...

Bibliographic Details
Main Authors: Jodie A Austin, Elton H Lobo, Mahnaz Samadbeik, Teyl Engstrom, Reji Philip, Jason D Pole, Clair M Sullivan
Format: Article
Language: English
Published: JMIR Publications 2024-12-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2024/1/e58637
collection DOAJ
description Traditionally, medical research is based on randomized controlled trials (RCTs) for interventions such as drugs and operative procedures. However, increasingly, there is a need for health research to evolve. RCTs are expensive to run, are generally formulated with a single research question in mind, and analyze a limited dataset for a restricted period. Progressively, health decision makers are focusing on real-world data (RWD) to deliver large-scale longitudinal insights that are actionable. RWD are collected as part of routine care in real time using digital health infrastructure. For example, understanding the effectiveness of an intervention could be enhanced by combining evidence from RCTs with RWD, providing insights into long-term outcomes in real-life situations. Clinicians and researchers struggle in the digital era to harness RWD for digital health research in an efficient and ethically and morally appropriate manner. This struggle encompasses challenges such as ensuring data quality, integrating diverse sources, establishing governance policies, ensuring regulatory compliance, developing analytical capabilities, and translating insights into actionable strategies. The same way that drug trials require infrastructure to support their conduct, digital health also necessitates new and disruptive research data infrastructure. Novel methods such as common data models, federated learning, and synthetic data generation are emerging to enhance the utility of research using RWD, which are often siloed across health systems. A continued focus on data privacy and ethical compliance remains. The past 25 years have seen a notable shift from an emphasis on RCTs as the only source of practice-guiding clinical evidence to the inclusion of modern-day methods harnessing RWD. This paper describes the evolution of synthetic data, common data models, and federated learning supported by strong cross-sector collaboration to support digital health research. 
Lessons learned are offered as a model for other jurisdictions with similar RWD infrastructure requirements.
id doaj-art-c46f6edfc79541b193de0cedbe0300e4
institution DOAJ
issn 1438-8871
spelling doaj-art-c46f6edfc79541b193de0cedbe0300e4, 2025-08-20T02:40:08Z, eng, JMIR Publications, Journal of Medical Internet Research, 1438-8871, 2024-12-01, vol. 26, e58637, doi:10.2196/58637
Jodie A Austin https://orcid.org/0000-0003-4969-7200
Elton H Lobo https://orcid.org/0000-0003-0096-6318
Mahnaz Samadbeik https://orcid.org/0000-0002-4756-2364
Teyl Engstrom https://orcid.org/0000-0002-1778-5006
Reji Philip https://orcid.org/0009-0006-2969-8299
Jason D Pole https://orcid.org/0000-0002-0413-5434
Clair M Sullivan https://orcid.org/0000-0003-2475-9989