Machine Learning-Based Prediction of No-Show Telemedicine Encounters

Bibliographic Details
Main Authors: C. Mahony Reategui-Rivera, Wanting Cui, Stefan Escobar-Agreda, Leonardo Rojas-Mezarina, Joseph Finkelstein
Format: Article
Language: English
Published: Mary Ann Liebert 2025-01-01
Series: Telemedicine Reports
Online Access: https://www.liebertpub.com/doi/10.1089/tmr.2025.0009
Description
Summary: Aim: This study aimed to evaluate the performance of machine learning (ML) models in predicting patient no-shows for telemedicine appointments within the Peruvian health system and to identify key predictors of nonattendance. Methods: We performed a retrospective observational study using anonymized data (June 2019–November 2023) from “Teleatiendo.” The dataset included over 1.5 million completed appointments and about 64,000 no-shows (4.1%), focusing on teleorientation and telemonitoring. Predictor variables included patient demographics, socioeconomic factors, health care facility characteristics, appointment timing, and telemedicine service types. A 70% training, 10% validation, and 20% testing split was used over 10 iterations, with hyperparameter tuning performed on the validation set to identify optimal model parameters. Multiple ML approaches (random forest, XGBoost, LightGBM, and anomaly detection) were implemented in combination with undersampling and cost-sensitive learning to address class imbalance. Performance was evaluated using precision, recall, specificity, area under the curve (AUC), F1-score, and accuracy. Results: Of the models tested, undersampling with XGBoost achieved a precision of 0.115 (±0.001), recall of 0.654 (±0.005), specificity of 0.786 (±0.002), AUC of 0.720 (±0.002), and accuracy of 0.780 (±0.002). In contrast, cost-sensitive XGBoost showed balanced performance, with a precision of 0.123 (±0.001), recall of 0.639 (±0.006), specificity of 0.805 (±0.004), AUC of 0.722 (±0.001), and accuracy of 0.799 (±0.003). Additionally, cost-sensitive random forest achieved the highest specificity (0.843 ± 0.002) and accuracy (0.832 ± 0.001) but recorded a lower recall (0.585 ± 0.004), while cost-sensitive LightGBM and balanced random forest yielded performance metrics similar to those of cost-sensitive XGBoost. Isolation forest, used for anomaly detection, demonstrated the lowest performance.
Conclusions: ML models can moderately predict telemedicine no-shows in Peru, with cost-sensitive boosting techniques enhancing the identification of high-risk patients. Key predictors reflect both individual behavior and system-level contexts, suggesting the need for tailored, context-specific interventions. These findings can inform targeted strategies to optimize telemedicine, improve appointment adherence, and promote equitable health care access.
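
The data handling described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy labels, the 1:1 undersampling ratio, and the negative-to-positive weighting rule are all assumptions made for illustration, since the abstract does not state how undersampling or cost weights were configured.

```python
import random

def split_indices(n, seed, train_pct=70, val_pct=10):
    """Shuffle 0..n-1 and cut into train/validation/test index lists (70/10/20 by default)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def undersample(indices, labels, seed):
    """Keep every no-show (label 1) plus an equal-sized random sample of attended visits (label 0).
    The 1:1 ratio is an assumption, not a detail taken from the study."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    return pos + rng.sample(neg, min(len(pos), len(neg)))

def positive_class_weight(indices, labels):
    """Cost-sensitive alternative: weight each no-show by the negative-to-positive ratio
    (an assumed weighting rule; the study's actual costs are not given in the abstract)."""
    n_pos = sum(labels[i] for i in indices)
    return (len(indices) - n_pos) / n_pos

# Synthetic labels with roughly the 4.1% no-show rate reported in the abstract
label_rng = random.Random(0)
labels = [1 if label_rng.random() < 0.041 else 0 for _ in range(10_000)]

train_idx, val_idx, test_idx = split_indices(len(labels), seed=0)
balanced_idx = undersample(train_idx, labels, seed=0)   # input for an undersampled model
weight = positive_class_weight(train_idx, labels)       # input for a cost-sensitive model
print(len(train_idx), len(val_idx), len(test_idx), len(balanced_idx), round(weight, 1))
```

Repeating the split with 10 different seeds would mirror the 10 iterations mentioned in the Methods. Undersampling discards most attended appointments from training, whereas class weighting keeps all rows and penalizes missed no-shows more heavily.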
ISSN: 2692-4366
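
The low precision reported in the Summary follows from the class imbalance, and a short calculation makes this visible. The sketch below rebuilds precision and accuracy from the reported recall and specificity, assuming the test sets keep the overall 4.1% no-show rate (an assumption; the abstract does not give test-set prevalence). The function names are illustrative.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity, and the positive-class prevalence."""
    true_pos = recall * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def accuracy_from_rates(recall, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of recall and specificity."""
    return recall * prevalence + specificity * (1 - prevalence)

PREVALENCE = 0.041  # no-show share from the abstract; assumed to hold in the test sets

# (recall, specificity, reported precision, reported accuracy) from the Summary
reported = {
    "undersampling XGBoost": (0.654, 0.786, 0.115, 0.780),
    "cost-sensitive XGBoost": (0.639, 0.805, 0.123, 0.799),
}

derived = {}
for name, (recall, specificity, _, _) in reported.items():
    derived[name] = (
        precision_from_rates(recall, specificity, PREVALENCE),
        accuracy_from_rates(recall, specificity, PREVALENCE),
    )
    print(name, round(derived[name][0], 3), round(derived[name][1], 3))
```

Under that assumption the derived values land close to the reported ones, with small differences attributable to rounding of the inputs. With only about 4 no-shows per 100 appointments, even a specificity near 0.8 yields several false alarms for each correctly flagged no-show.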