Predicting the risk of pulmonary embolism in patients with tuberculosis using machine learning algorithms
| Main Authors: | Haobo Kong, Yong Li, Ya Shen, Jingjing Pan, Min Liang, Zhi Geng, Yanbei Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2024-12-01 |
| Series: | European Journal of Medical Research |
| Subjects: | Machine learning; Pulmonary embolism; Pulmonary tuberculosis; Risk prediction |
| Online Access: | https://doi.org/10.1186/s40001-024-02218-3 |
| _version_ | 1850035297128022016 |
|---|---|
| author | Haobo Kong; Yong Li; Ya Shen; Jingjing Pan; Min Liang; Zhi Geng; Yanbei Zhang |
| author_sort | Haobo Kong |
| collection | DOAJ |
| description | Abstract. Background: This study aimed to develop predictive models with robust generalization for assessing the risk of pulmonary embolism in patients with tuberculosis using machine learning algorithms. Methods: Data were collected from two centers and divided into development and validation cohorts. Using the development cohort, candidate variables were selected via Recursive Feature Elimination (RFE). Five machine learning algorithms, logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and support vector machine (SVM), were used to construct the predictive models. Model performance was evaluated through nested cross-validation and the area under the curve (AUC), supplemented by interpretation using SHapley Additive exPlanations (SHAP) and line charts of AUC values. The models were then externally validated on an independent validation group, with the goal of facilitating early identification and management of pulmonary embolism risk in tuberculosis patients. Results: Data from 694 patients were used for model development, and 236 patients from the validation group met the enrollment criteria. The optimal subset of variables included D-dimer, smoking status, dyspnea, age, sex, diabetes, platelet count, cough, fibrinogen, hemoglobin, hemoptysis, hypertension, chronic obstructive pulmonary disease (COPD), and chest pain. The RF model outperformed the others, achieving an AUC of 0.839 (95% CI 0.780–0.899) and maintaining the highest average performance in external fivefold cross-validation (AUC: 0.906 ± 0.041). Conclusions: The RF model demonstrates high and consistent effectiveness in predicting pulmonary embolism risk in tuberculosis patients. |
| format | Article |
| id | doaj-art-2c84295ecac04b1daf5f50e57cba7ffd |
| institution | DOAJ |
| issn | 2047-783X |
| language | English |
| publishDate | 2024-12-01 |
| publisher | BMC |
| record_format | Article |
| series | European Journal of Medical Research |
| spelling | doaj-art-2c84295ecac04b1daf5f50e57cba7ffd; 2025-08-20T02:57:32Z; eng; BMC; European Journal of Medical Research; 2047-783X; 2024-12-01; 29119; 10.1186/s40001-024-02218-3; Predicting the risk of pulmonary embolism in patients with tuberculosis using machine learning algorithms; Haobo Kong (Department of Geriatric Respiratory and Critical Care, Anhui Geriatric Institute, The First Affiliated Hospital of Anhui Medical University); Yong Li (Department of Geriatric Respiratory and Critical Care, Anhui Geriatric Institute, The First Affiliated Hospital of Anhui Medical University); Ya Shen (Department of Respiratory and Critical Care Medicine, Fuyang Infectious Disease Clinical College of Anhui Medical University); Jingjing Pan (Department of Respiratory Intensive Care Unit, Anhui Medical University Clinical College of Chest & Anhui Chest Hospital); Min Liang (Department of Tuberculosis, Anhui Medical University Clinical College of Chest & Anhui Chest Hospital); Zhi Geng (Department of Neurology, The First Affiliated Hospital of Anhui Medical University); Yanbei Zhang (Department of Geriatric Respiratory and Critical Care, Anhui Geriatric Institute, The First Affiliated Hospital of Anhui Medical University); https://doi.org/10.1186/s40001-024-02218-3; Machine learning; Pulmonary embolism; Pulmonary tuberculosis; Risk prediction |
| title | Predicting the risk of pulmonary embolism in patients with tuberculosis using machine learning algorithms |
| topic | Machine learning; Pulmonary embolism; Pulmonary tuberculosis; Risk prediction |
| url | https://doi.org/10.1186/s40001-024-02218-3 |
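The abstract reports model performance as an AUC and, for cross-validation, as a mean ± spread across folds. As a minimal sketch of how such figures can be computed, the block below calculates AUC as the fraction of positive/negative pairs ranked correctly and summarizes per-fold AUCs as mean and sample standard deviation. The function names, the toy labels and scores, and the choice of sample (rather than population) standard deviation are illustrative assumptions; the record does not describe the authors' actual implementation.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Mean and sample standard deviation of per-fold AUC values
    (assumption: the '±' in a report denotes a standard deviation)."""
    return mean(fold_aucs), stdev(fold_aucs)


# Toy data for illustration only; these are not values from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)

toy_fold_mean, toy_fold_sd = summarize_folds([0.5, 0.75, 1.0])
```

The pairwise definition is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger data.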