Mapping the EORTC QLQ-C30 and QLQ-LC13 to the SF-6D utility index in patients with lung cancer using machine learning and traditional regression methods

Bibliographic Details
Main Authors: Longlin Jiang, Kexun Li, Simiao Lu, Zhou Hong, Yifang Wang, Qin Xie, Qin He, Sirui Wei, Aoru Zhou, Hong Kang, Xuefeng Leng, Qing Yang, Yan Miao
Format: Article
Language: English
Published: BMC 2025-07-01
Series: Health and Quality of Life Outcomes
Online Access: https://doi.org/10.1186/s12955-025-02394-8
Description
Summary: Abstract. Background: Preference-based measures of health-related quality of life (HRQoL), such as the Short Form Six-Dimension (SF-6D), are essential for health economic evaluations. However, these measures are rarely included in clinical trials for lung cancer. This study aims to develop mapping algorithms to predict SF-6D health utility scores from the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Core (EORTC QLQ-C30) and the Quality of Life Questionnaire-Lung Cancer 13 (QLQ-LC13). Methods: The study sample comprised a Chinese population with lung cancer (n = 625). Traditional regression techniques, including Ordinary Least Squares regression and the Generalized Linear Model, and machine learning techniques, such as Gradient Boosting Tree, Support Vector Regression, and Ridge Regression, were used. Five-fold cross-validation was performed. The optimal model was selected using four performance metrics: R², root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Results: The mean and median SF-6D health utility values were 0.774 (SD = 0.154) and 7.795, respectively. Five-fold cross-validation (CV) showed that the Ridge regression model had the best mapping performance, with final prediction indexes of R² = 0.753, RMSE = 0.074, MAE = 0.057, and MAPE = 8.169%. Conclusions: This study developed an optimized mapping algorithm to predict the SF-6D utility index from the QLQ-C30 and QLQ-LC13. This algorithm provides an effective alternative for estimating SF-6D utilities when preference-based health utility values are unavailable.
ISSN: 1477-7525
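
The evaluation named in the summary (five-fold cross-validation scored by R², RMSE, MAE, and MAPE) can be illustrated with a minimal Python sketch. This is not the authors' code or their fitted algorithm. It assumes a single hypothetical predictor in place of the full set of QLQ-C30 and QLQ-LC13 scores, synthetic data, an arbitrary penalty `alpha`, and fold-averaged metrics; the record does not state how the study aggregated results across folds. All function and variable names are illustrative.

```python
import math
import random


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one predictor; the intercept is not penalized."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, y_bar - slope * x_bar


def metrics(y_true, y_pred):
    """R2, RMSE, MAE and MAPE (in percent) for observed vs. predicted values."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "MAPE": 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n,
    }


def cross_validate(x, y, k=5, alpha=1.0, seed=0):
    """Shuffle once, split into k folds, and average each metric over the folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    per_fold = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_ridge_1d(
            [x[i] for i in train], [y[i] for i in train], alpha
        )
        preds = [intercept + slope * x[i] for i in held_out]
        per_fold.append(metrics([y[i] for i in held_out], preds))
    return {name: sum(m[name] for m in per_fold) / k for name in per_fold[0]}


# Synthetic stand-in data (an assumption, not study data): a 0-100 scale score
# and a noisy, clamped utility value that depends on it linearly.
rng = random.Random(42)
scores = [rng.uniform(0, 100) for _ in range(200)]
utilities = [
    min(1.0, max(0.3, 0.4 + 0.005 * s + rng.gauss(0, 0.05))) for s in scores
]

cv_results = cross_validate(scores, utilities, k=5, alpha=1.0)
print(cv_results)
```

With several predictors, as in the study, the same loop would wrap a multivariable ridge fit; only `fit_ridge_1d` and the prediction line would change.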