Advancing the Use of Longitudinal Electronic Health Records: Tutorial for Uncovering Real-World Evidence in Chronic Disease Outcomes

Bibliographic Details
Main Authors: Feiqing Huang, Jue Hou, Ningxuan Zhou, Kimberly Greco, Chenyu Lin, Sara Morini Sweet, Jun Wen, Lechen Shen, Nicolas Gonzalez, Sinian Zhang, Katherine P Liao, Tianrun Cai, Zongqi Xia, Florence T Bourgeois, Tianxi Cai
Format: Article
Language: English
Published: JMIR Publications 2025-05-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e71873
Description
Summary: Managing chronic diseases requires ongoing monitoring of disease activity and therapeutic responses to optimize treatment plans. With the growing availability of disease-modifying therapies, it is crucial to investigate comparative effectiveness and long-term outcomes beyond those available from randomized clinical trials. We introduce a comprehensive pipeline for generating reproducible and generalizable real-world evidence on disease outcomes by leveraging electronic health record data. The pipeline first generates scalable disease outcomes by linking electronic health record data with registry data containing a small sample of labeled outcomes. It then applies causal analysis using these scalable outcomes to evaluate therapies for chronic diseases. The implementation of the pipeline is illustrated in a case study based on multiple sclerosis. Our approach addresses challenges in real-world evidence generation for disease activity of chronic conditions, specifically the lack of direct observations on key outcomes and biases arising from imperfect or incomplete data. We present advanced machine learning techniques such as semisupervised and ensemble methods to impute missing outcome data, further incorporating steps for calibrated causal analyses and bias correction.
ISSN:1438-8871