Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users

Bibliographic Details
Main Authors: Christopher Meaney, Xuesong Wang, Jun Guan, Therese A. Stukel
Format: Article
Language:English
Published: BMC 2025-05-01
Series:BMC Medical Research Methodology
Subjects:
Online Access:https://doi.org/10.1186/s12874-025-02561-x
_version_ 1849309833541451776
author Christopher Meaney
Xuesong Wang
Jun Guan
Therese A. Stukel
author_facet Christopher Meaney
Xuesong Wang
Jun Guan
Therese A. Stukel
author_sort Christopher Meaney
collection DOAJ
description Abstract Background Supervised machine learning is increasingly being used to estimate clinical predictive models. Several supervised machine learning models involve hyper-parameters, whose values must be judiciously specified to ensure adequate predictive performance. Objective To compare nine hyper-parameter optimization (HPO) methods for tuning the hyper-parameters of an extreme gradient boosting model, with application to predicting high-need high-cost health care users. Methods Extreme gradient boosting models were estimated using a randomly sampled training dataset. Models were separately trained using nine different HPO methods: 1) random sampling, 2) simulated annealing, 3) quasi-Monte Carlo sampling, 4-5) two variations of Bayesian hyper-parameter optimization via tree-Parzen estimation, 6-7) two implementations of Bayesian hyper-parameter optimization via Gaussian processes, 8) Bayesian hyper-parameter optimization via random forests, and 9) the covariance matrix adaptation evolution strategy. For each HPO method, we estimated 100 extreme gradient boosting models at different hyper-parameter configurations and evaluated model performance using an AUC metric on a randomly sampled validation dataset. Using the best model identified by each HPO method, we evaluated generalization performance in terms of discrimination and calibration metrics on a randomly sampled held-out test dataset (internal validation) and a temporally independent dataset (external validation). Results The extreme gradient boosting model estimated using default hyper-parameter settings had reasonable discrimination (AUC=0.82) but was not well calibrated. Hyper-parameter tuning using any HPO algorithm/sampler improved model discrimination (AUC=0.84), resulted in models with near-perfect calibration, and consistently identified features predictive of high-need high-cost health care users.
Conclusions In our study, all HPO algorithms yielded similar gains in model performance relative to the baseline model. This finding likely reflects our study dataset's large sample size, relatively small number of features, and strong signal-to-noise ratio, and would likely extend to other datasets with similar characteristics.
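The tuning workflow the abstract describes (propose a hyper-parameter configuration, fit a model, score it on a validation set, and keep the best of 100 trials) can be sketched generically. The sketch below is illustrative only and is not the paper's code: the search ranges and the surrogate objective are invented stand-ins for fitting an XGBoost model and computing a validation-set AUC, and it implements only the simplest of the nine methods (random sampling).

```python
import random

# Hypothetical search ranges, invented for illustration (not from the paper).
SPACE = {
    "learning_rate": (0.01, 0.3),
    "max_depth": (2, 10),
    "subsample": (0.5, 1.0),
}

def surrogate_validation_auc(cfg):
    """Stand-in for 'train a boosting model at cfg, return validation AUC'.

    A toy objective that peaks near learning_rate=0.1, max_depth=6,
    subsample=0.8, so the example stays dependency-free and fast.
    """
    penalty = ((cfg["learning_rate"] - 0.1) ** 2
               + 0.001 * (cfg["max_depth"] - 6) ** 2
               + (cfg["subsample"] - 0.8) ** 2)
    return 0.84 - penalty

def sample_config(rng):
    """Draw one configuration uniformly from the search space."""
    return {
        "learning_rate": rng.uniform(*SPACE["learning_rate"]),
        "max_depth": rng.randint(*SPACE["max_depth"]),
        "subsample": rng.uniform(*SPACE["subsample"]),
    }

def random_search(n_trials=100, seed=0):
    """HPO method 1 from the abstract: random sampling.

    Evaluates n_trials configurations on the (surrogate) validation
    metric and returns the best configuration and its score.
    """
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    scores = [surrogate_validation_auc(c) for c in trials]
    best = max(range(n_trials), key=scores.__getitem__)
    return trials[best], scores[best]

best_cfg, best_auc = random_search()
# best_auc is the best surrogate "validation AUC" found across 100 trials.
```

In the paper's setting, `surrogate_validation_auc` would instead fit an extreme gradient boosting model and score it on the held-out validation split, and the other eight samplers (simulated annealing, quasi-Monte Carlo, tree-Parzen estimators, Gaussian processes, random forests, CMA-ES) would replace the uniform draws in `sample_config` while the select-best-of-100 loop stays the same.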
format Article
id doaj-art-e0034432aa2f49639f7beaa72357d3be
institution Kabale University
issn 1471-2288
language English
publishDate 2025-05-01
publisher BMC
record_format Article
series BMC Medical Research Methodology
spelling doaj-art-e0034432aa2f49639f7beaa72357d3be 2025-08-20T03:53:57Z eng BMC BMC Medical Research Methodology 1471-2288 2025-05-01 25 1 1 13 10.1186/s12874-025-02561-x Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users Christopher Meaney (Department of Family and Community Medicine, University of Toronto), Xuesong Wang (ICES), Jun Guan (ICES), Therese A. Stukel (ICES) Abstract Background Supervised machine learning is increasingly being used to estimate clinical predictive models. Several supervised machine learning models involve hyper-parameters, whose values must be judiciously specified to ensure adequate predictive performance. Objective To compare nine hyper-parameter optimization (HPO) methods for tuning the hyper-parameters of an extreme gradient boosting model, with application to predicting high-need high-cost health care users. Methods Extreme gradient boosting models were estimated using a randomly sampled training dataset. Models were separately trained using nine different HPO methods: 1) random sampling, 2) simulated annealing, 3) quasi-Monte Carlo sampling, 4-5) two variations of Bayesian hyper-parameter optimization via tree-Parzen estimation, 6-7) two implementations of Bayesian hyper-parameter optimization via Gaussian processes, 8) Bayesian hyper-parameter optimization via random forests, and 9) the covariance matrix adaptation evolution strategy. For each HPO method, we estimated 100 extreme gradient boosting models at different hyper-parameter configurations and evaluated model performance using an AUC metric on a randomly sampled validation dataset. Using the best model identified by each HPO method, we evaluated generalization performance in terms of discrimination and calibration metrics on a randomly sampled held-out test dataset (internal validation) and a temporally independent dataset (external validation).
Results The extreme gradient boosting model estimated using default hyper-parameter settings had reasonable discrimination (AUC=0.82) but was not well calibrated. Hyper-parameter tuning using any HPO algorithm/sampler improved model discrimination (AUC=0.84), resulted in models with near-perfect calibration, and consistently identified features predictive of high-need high-cost health care users. Conclusions In our study, all HPO algorithms yielded similar gains in model performance relative to the baseline model. This finding likely reflects our study dataset's large sample size, relatively small number of features, and strong signal-to-noise ratio, and would likely extend to other datasets with similar characteristics. https://doi.org/10.1186/s12874-025-02561-x Supervised machine learning; Clinical predictive modelling; Prediction model; Hyper-parameter optimization (HPO); Hyper-parameter tuning (HPT); Extreme gradient boosting classifier
spellingShingle Christopher Meaney
Xuesong Wang
Jun Guan
Therese A. Stukel
Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
BMC Medical Research Methodology
Supervised machine learning
Clinical predictive modelling
Prediction model
Hyper-parameter optimization (HPO)
Hyper-parameter tuning (HPT)
Extreme gradient boosting classifier
title Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
title_full Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
title_fullStr Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
title_full_unstemmed Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
title_short Comparison of methods for tuning machine learning model hyper-parameters: with application to predicting high-need high-cost health care users
title_sort comparison of methods for tuning machine learning model hyper parameters with application to predicting high need high cost health care users
topic Supervised machine learning
Clinical predictive modelling
Prediction model
Hyper-parameter optimization (HPO)
Hyper-parameter tuning (HPT)
Extreme gradient boosting classifier
url https://doi.org/10.1186/s12874-025-02561-x
work_keys_str_mv AT christophermeaney comparisonofmethodsfortuningmachinelearningmodelhyperparameterswithapplicationtopredictinghighneedhighcosthealthcareusers
AT xuesongwang comparisonofmethodsfortuningmachinelearningmodelhyperparameterswithapplicationtopredictinghighneedhighcosthealthcareusers
AT junguan comparisonofmethodsfortuningmachinelearningmodelhyperparameterswithapplicationtopredictinghighneedhighcosthealthcareusers
AT thereseastukel comparisonofmethodsfortuningmachinelearningmodelhyperparameterswithapplicationtopredictinghighneedhighcosthealthcareusers