Multiple states clustering analysis (MSCA), an unsupervised approach to multiple time-to-event electronic health records applied to multimorbidity associated with myocardial infarction

Abstract Multimorbidity is characterized by the accrual of two or more long-term conditions (LTCs) in an individual. This state of health is increasingly prevalent and poses public health challenges. Adapting approaches to effectively analyse electronic health records is needed to better understand...

Bibliographic Details
Main Authors: Marc Delord, Abdel Douiri
Format: Article
Language: English
Published: BMC 2025-02-01
Series: BMC Medical Research Methodology
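Subjects: Electronic health records; Multiple state analysis; Jaccard dissimilarity index; Ward clustering; Multimorbidity; Myocardial infarction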
Online Access: https://doi.org/10.1186/s12874-025-02476-7
_version_ 1823861954001764352
author Marc Delord
Abdel Douiri
author_facet Marc Delord
Abdel Douiri
author_sort Marc Delord
collection DOAJ
description Abstract Multimorbidity is characterized by the accrual of two or more long-term conditions (LTCs) in an individual. This state of health is increasingly prevalent and poses public health challenges. Adapting approaches to effectively analyse electronic health records is needed to better understand multimorbidity. We propose a novel unsupervised clustering approach to multiple time-to-event health records denoted as multiple state clustering analysis (MSCA). In MSCA, patients’ pairwise dissimilarities are computed using patients’ state matrices which are composed of multiple censored time-to-event indicators reflecting patients’ health history. The use of state matrices enables the analysis of an arbitrary number of LTCs without reducing patients’ health trajectories to a particular sequence of events. MSCA was applied to analyse multimorbidity associated with myocardial infarction using electronic health records of 26 LTCs, including conventional cardiovascular risk factors (CVRFs) such as diabetes and hypertension, collected from south London general practices between 2005 and 2021 in 5087 patients using the MSCA R library. We identified a typology of 11 clusters, characterised by age at onset of myocardial infarction, sequences of conventional CVRFs and non-conventional risk factors including physical and mental health conditions. Interestingly, multivariate analysis revealed that clusters were also associated with various combinations of socio-demographic characteristics including gender and ethnicity. By identifying meaningful sequences of LTCs associated with myocardial infarction and distinct socio-demographic characteristics, MSCA proves to be an effective approach to the analysis of electronic health records, with the potential to enhance our understanding of multimorbidity for improved prevention and management.
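The approach described in the abstract (state matrices, pairwise dissimilarities, clustering) can be illustrated with a minimal Python sketch. This is not the MSCA R library and does not reproduce its API. The toy patients, the 5-year age grid, the `state_matrix` helper, and the choice to set cells after the end of follow-up to 0 are all assumptions made for illustration. The Jaccard index and Ward clustering are taken from the record's keywords. SciPy's Ward linkage is formally defined for Euclidean distances, so applying it to Jaccard dissimilarities here is a simplification.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def state_matrix(onset_ages, end_age, grid):
    """Hypothetical helper: boolean (conditions x ages) matrix for one patient.

    A cell is True when the condition has started by that age and the age
    is still within follow-up. NaN onset means the condition was never recorded.
    """
    onset = np.asarray(onset_ages, dtype=float)[:, None]
    ages = np.asarray(grid, dtype=float)[None, :]
    with np.errstate(invalid="ignore"):
        present = onset <= ages        # comparisons with NaN evaluate to False
    observed = ages <= end_age         # assumption: censored cells are set to False
    return present & observed


# Toy data (invented): onset ages for hypertension, diabetes, depression,
# followed by the age at end of follow-up.
nan = np.nan
patients = [
    ([40, 50, nan], 80),
    ([40, 50, nan], 80),
    ([45, 50, nan], 80),
    ([nan, nan, 35], 70),
    ([nan, 75, 35], 80),
    ([nan, nan, 40], 75),
]
grid = np.arange(30, 85, 5)            # ages 30, 35, ..., 80

# Flatten each state matrix into one boolean vector per patient.
X = np.array([state_matrix(o, e, grid).ravel() for o, e in patients])

# Pairwise Jaccard dissimilarities (condensed form), then Ward linkage.
D = pdist(X, metric="jaccard")
Z = linkage(D, method="ward")

# Cut the dendrogram into two clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

On this toy input the first three patients (early hypertension and diabetes) fall in one cluster and the last three (early depression) in the other. In real data the time grid, the treatment of censoring, and the number of clusters would all need to be chosen deliberately.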
format Article
id doaj-art-3ccab4a2a22341c0a69ec71f1f86efef
institution Kabale University
issn 1471-2288
language English
publishDate 2025-02-01
publisher BMC
record_format Article
series BMC Medical Research Methodology
spelling doaj-art-3ccab4a2a22341c0a69ec71f1f86efef
2025-02-09T12:43:12Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2025-02-01
25
1
1
16
10.1186/s12874-025-02476-7
Multiple states clustering analysis (MSCA), an unsupervised approach to multiple time-to-event electronic health records applied to multimorbidity associated with myocardial infarction
Marc Delord
Abdel Douiri
School of Life Course & Population Sciences, Department of Population Health Sciences, King’s College London
School of Life Course & Population Sciences, Department of Population Health Sciences, King’s College London
Abstract Multimorbidity is characterized by the accrual of two or more long-term conditions (LTCs) in an individual. This state of health is increasingly prevalent and poses public health challenges. Adapting approaches to effectively analyse electronic health records is needed to better understand multimorbidity. We propose a novel unsupervised clustering approach to multiple time-to-event health records denoted as multiple state clustering analysis (MSCA). In MSCA, patients’ pairwise dissimilarities are computed using patients’ state matrices which are composed of multiple censored time-to-event indicators reflecting patients’ health history. The use of state matrices enables the analysis of an arbitrary number of LTCs without reducing patients’ health trajectories to a particular sequence of events. MSCA was applied to analyse multimorbidity associated with myocardial infarction using electronic health records of 26 LTCs, including conventional cardiovascular risk factors (CVRFs) such as diabetes and hypertension, collected from south London general practices between 2005 and 2021 in 5087 patients using the MSCA R library. We identified a typology of 11 clusters, characterised by age at onset of myocardial infarction, sequences of conventional CVRFs and non-conventional risk factors including physical and mental health conditions. Interestingly, multivariate analysis revealed that clusters were also associated with various combinations of socio-demographic characteristics including gender and ethnicity. By identifying meaningful sequences of LTCs associated with myocardial infarction and distinct socio-demographic characteristics, MSCA proves to be an effective approach to the analysis of electronic health records, with the potential to enhance our understanding of multimorbidity for improved prevention and management.
https://doi.org/10.1186/s12874-025-02476-7
Electronic health records
Multiple state analysis
Jaccard dissimilarity index
Ward clustering
Multimorbidity
Myocardial infarction
title Multiple states clustering analysis (MSCA), an unsupervised approach to multiple time-to-event electronic health records applied to multimorbidity associated with myocardial infarction
topic Electronic health records
Multiple state analysis
Jaccard dissimilarity index
Ward clustering
Multimorbidity
Myocardial infarction
url https://doi.org/10.1186/s12874-025-02476-7