Diagnostic framework to validate clinical machine learning models locally on temporally stamped data

Abstract Background Real-world medical environments such as oncology are highly dynamic due to rapid changes in medical practice, technologies, and patient characteristics. This variability, if not addressed, can result in data shifts with potentially poor model performance. Presently, there are few...

Bibliographic Details
Main Authors: Maximilian Schuessler, Scott Fleming, Shannon Meyer, Tina Seto, Tina Hernandez-Boussard
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Communications Medicine
Online Access: https://doi.org/10.1038/s43856-025-00965-w
author Maximilian Schuessler
Scott Fleming
Shannon Meyer
Tina Seto
Tina Hernandez-Boussard
collection DOAJ
description Abstract
Background: Real-world medical environments such as oncology are highly dynamic due to rapid changes in medical practice, technologies, and patient characteristics. This variability, if not addressed, can result in data shifts with potentially poor model performance. Presently, there are few easy-to-implement, model-agnostic diagnostic frameworks to vet machine learning models for future applicability and temporal consistency.
Methods: We extracted clinical data from electronic health records (EHR) for a cohort of over 24,000 patients who received antineoplastic therapy within a distinct year. The labels in this study are acute care utilization (ACU) events, i.e., emergency department visits and hospitalizations, within 180 days of treatment initiation. Our cross-sectional data span treatment initiation points from 2010 to 2022. We implemented three models within our validation framework: Least Absolute Shrinkage and Selection Operator (LASSO), Random Forest (RF), and Extreme Gradient Boosting (XGBoost).
Results: Here, we introduce a model-agnostic diagnostic framework to validate clinical machine learning models on time-stamped data, consisting of four stages. First, the framework evaluates performance by partitioning data from multiple years into training and validation cohorts. Second, it characterizes the temporal evolution of patient outcomes and characteristics. Third, model longevity and trade-offs between data quantity and recency are explored. Finally, feature importance and data valuation algorithms are applied for feature reduction and data quality assessment. When applied to predicting ACU in cancer patients, the framework highlights fluctuations in features, labels, and data values over time.
Conclusions: The work in this study emphasizes the importance of data timeliness and relevance. The results on ACU in cancer patients show moderate signs of drift and corroborate the relevance of temporal considerations when validating machine learning models for deployment at the point of care.
format Article
id doaj-art-1be2b99b665e407a82c0500f6b1a769e
institution Kabale University
issn 2730-664X
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Communications Medicine
spelling doaj-art-1be2b99b665e407a82c0500f6b1a769e; 2025-08-20T04:01:36Z; eng; Nature Portfolio; Communications Medicine; 2730-664X; 2025-07-01; vol. 5, no. 1, pp. 1-12; 10.1038/s43856-025-00965-w; Diagnostic framework to validate clinical machine learning models locally on temporally stamped data
Maximilian Schuessler (Department of Biomedical Data Science, Stanford University)
Scott Fleming (Department of Biomedical Data Science, Stanford University)
Shannon Meyer (Department of Statistics, Stanford University)
Tina Seto (Research Technology, Stanford Health Care)
Tina Hernandez-Boussard (Department of Biomedical Data Science, Stanford University)
https://doi.org/10.1038/s43856-025-00965-w
title Diagnostic framework to validate clinical machine learning models locally on temporally stamped data
url https://doi.org/10.1038/s43856-025-00965-w