Treatment journey clustering with a novel kernel k-means machine learning algorithm: a retrospective analysis of insurance claims in bipolar I disorder

Bibliographic Details
Main Authors: Matthew Littman, Huy-Binh Nguyen, Joanna Campbell, Katelyn R. Keyloun
Format: Article
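Language: English
Published: SpringerOpen 2025-05-01
Series: Brain Informatics
Subjects: Bipolar 1 disorder; Clustering algorithm; Machine learning; Real-world evidence; Sequence clustering; Treatment journey
Online Access: https://doi.org/10.1186/s40708-025-00258-x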
author Matthew Littman
Huy-Binh Nguyen
Joanna Campbell
Katelyn R. Keyloun
collection DOAJ
description Abstract In real-world psychiatric practice, patients may experience complex treatment journeys, including various diagnoses and lines of therapy. Insurance claims databases could potentially provide insight into outcomes of psychiatric treatment processes, but the diversity of event sequences restricts analyses with currently available methods. Here, we developed a novel kernel k-means clustering algorithm for event sequences that can accommodate highly diverse event types and sequence lengths. The approach, Divisive Optimized Clustering using Kernel K-means for Event Sequences (DOCKKES), also leverages a novel performance metric, the transition score, which measures sequence coherence in individual clusters. The performance of DOCKKES was evaluated in the context of bipolar I disorder, which is characterized by heterogeneous treatment journeys. We conducted a retrospective, observational analysis of a large sample (n = 31,578) of patients with bipolar I disorder from the MarketScan® Commercial Database. Using insurance claims, bipolar episode diagnoses and mental health–related lines of therapy were identified as events of interest for patient clustering. The dataset included 202,122 events; 75% of the cohort experienced unique treatment journeys. Based on an optimal run, DOCKKES identified 16 treatment journey clusters, which were evenly split for initial manic/mixed or depressive episodes (8 clusters each) and varied in sequence length and early lines of therapy. Variability across clusters was also observed for demographics, comorbidities, and mental health–related healthcare resource utilization and cost. This proof-of-concept study demonstrated the use of DOCKKES for integrating information from large datasets, enabling comparisons between patient clusters and evaluation of real-world treatment journeys in the context of evidence-based guidelines.
format Article
id doaj-art-5c179130c82e49949e67b7eca8f8c1c8
institution OA Journals
issn 2198-4018
2198-4026
language English
publishDate 2025-05-01
publisher SpringerOpen
record_format Article
series Brain Informatics
spelling doaj-art-5c179130c82e49949e67b7eca8f8c1c8
2025-08-20T01:53:15Z
eng
SpringerOpen
Brain Informatics
2198-4018
2198-4026
2025-05-01
Volume 12, Issue 1, Pages 1-18
10.1186/s40708-025-00258-x
Matthew Littman (AbbVie)
Huy-Binh Nguyen (AbbVie)
Joanna Campbell (AbbVie)
Katelyn R. Keyloun (AbbVie)
title Treatment journey clustering with a novel kernel k-means machine learning algorithm: a retrospective analysis of insurance claims in bipolar I disorder
topic Bipolar 1 disorder
Clustering algorithm
Machine learning
Real-world evidence
Sequence clustering
Treatment journey
url https://doi.org/10.1186/s40708-025-00258-x
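
Illustrative sketch (not part of the catalogued record): the description above refers to kernel k-means clustering of event sequences with differing lengths. The record does not specify the DOCKKES kernel, its divisive procedure, or the definition of the transition score, so none of those are reproduced here. The Python below is only a minimal sketch of generic kernel k-means under stated assumptions: a hypothetical kernel built from event and transition counts, hypothetical event codes (M, D, AP, MS, AD), and made-up toy journeys rather than study data.

```python
from collections import Counter
from math import sqrt


def profile(seq):
    """Count single events and adjacent-event transitions in one sequence."""
    counts = Counter(seq)
    counts.update(zip(seq, seq[1:]))
    return counts


def kernel(p, q):
    """Cosine-normalised dot product of two count profiles.

    This is an assumed, illustrative kernel; it handles sequences of any
    length because it only compares counts.
    """
    dot = sum(v * q[k] for k, v in p.items())
    norm = sqrt(sum(v * v for v in p.values()) * sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def kernel_kmeans(K, labels, n_clusters, max_iter=50):
    """Plain kernel k-means on a precomputed Gram matrix K.

    The squared distance from point i to the implicit centroid of cluster C is
    K[i][i] - (2/|C|) * sum_j K[i][j] + (1/|C|^2) * sum_{j,l} K[j][l].
    """
    n = len(K)
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(n_clusters)]
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                d = K[i][i] - 2 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Hypothetical toy journeys: M/D = manic/depressive episode,
# AP/MS/AD = antipsychotic / mood stabilizer / antidepressant line of therapy.
journeys = [
    ["M", "AP", "MS"],
    ["M", "AP", "MS", "AP"],
    ["M", "AP"],
    ["D", "AD", "MS"],
    ["D", "AD"],
    ["D", "AD", "MS", "AD"],
]
profiles = [profile(j) for j in journeys]
K = [[kernel(p, q) for q in profiles] for p in profiles]

# Deliberately poor alternating start; the update loop regroups the journeys.
labels = kernel_kmeans(K, labels=[0, 1, 0, 1, 0, 1], n_clusters=2)
print(labels)
```

In this toy run the journeys that open with a manic episode end up in one cluster and those that open with a depressive episode in the other. Working from a precomputed Gram matrix means the kernel can be swapped without touching the clustering loop, which is the main reason kernel methods suit sequences of unequal length.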