Machine learning for prediction of Helicobacter pylori infection based on basic health examination data in adults: a retrospective study
Objective: This study aimed to investigate the feasibility of developing machine learning models for non-invasive prediction of Helicobacter pylori (H pylori) infection using routinely collected adult health screening data, including demographic characteristics and clinical biomarkers, to establish a...
| Main Authors: | Qiaoli Wang, Tao Liang, Yuexi Li, Peng Zhou, Xiaoqin Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-06-01 |
| Series: | Frontiers in Medicine |
| Subjects: | machine learning; H pylori infection; basic health examination; SHAP analysis; health examination |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fmed.2025.1587540/full |
| _version_ | 1849468571354136576 |
|---|---|
| author | Qiaoli Wang, Tao Liang, Yuexi Li, Peng Zhou, Xiaoqin Liu |
| author_sort | Qiaoli Wang |
| collection | DOAJ |
| description | Objective: This study aimed to investigate the feasibility of developing machine learning models for non-invasive prediction of Helicobacter pylori (H pylori) infection using routinely collected adult health screening data, including demographic characteristics and clinical biomarkers, to establish a potential decision-support tool for clinical practice. Methods: Data were drawn from adult health examination records at the hospital's health management centers. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for feature selection. Six machine learning algorithms were used to build predictive models, and their performance was evaluated and compared. The SHapley Additive exPlanations (SHAP) method was used to visualize feature contributions and to explain predictions for individual cases. Results: A total of 10,393 subjects were included, of whom 3,278 (31.54%) had H pylori infection. After feature screening, 10 factors were selected for the prediction model. Among the six machine learning models, the Extra Trees model performed best, with an AUC of 0.827, accuracy of 0.744, and recall of 0.736. The Random Forest model reached an AUC of 0.810, and XGBoost reached an AUC of 0.801, indicating moderate predictive capability. SHAP analysis ranked age, white blood cell count (WBC), albumin (ALB), gender, and waist circumference as the five most influential features. Older age, higher WBC, larger waist circumference, and lower ALB were associated with a higher predicted probability of infection. Conclusion: The Extra Trees classifier performed best among the evaluated models in predicting H pylori infection. SHAP analysis improved the interpretability of the model, which may support early-stage clinical prediction and intervention strategies. |
| format | Article |
| id | doaj-art-2d15dc65e80249cdac99ce6b43ee06da |
| institution | Kabale University |
| issn | 2296-858X |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Medicine |
| spelling | doaj-art-2d15dc65e80249cdac99ce6b43ee06da; 2025-08-20T03:25:49Z; eng; Frontiers Media S.A.; Frontiers in Medicine; ISSN 2296-858X; 2025-06-01; volume 12; DOI 10.3389/fmed.2025.1587540; article 1587540; Machine learning for prediction of Helicobacter pylori infection based on basic health examination data in adults: a retrospective study; Qiaoli Wang (Health Management Center, Deyang People’s Hospital, Deyang, Sichuan, China); Tao Liang (Department of Gastroenterology, Deyang People’s Hospital, Deyang, Sichuan, China); Yuexi Li (Health Management Center, Deyang People’s Hospital, Deyang, Sichuan, China); Peng Zhou (Health Management Center, Deyang People’s Hospital, Deyang, Sichuan, China); Xiaoqin Liu (Health Management Center, Deyang People’s Hospital, Deyang, Sichuan, China); https://www.frontiersin.org/articles/10.3389/fmed.2025.1587540/full; machine learning; H pylori infection; basic health examination; SHAP analysis; health examination |
| title | Machine learning for prediction of Helicobacter pylori infection based on basic health examination data in adults: a retrospective study |
| topic | machine learning; H pylori infection; basic health examination; SHAP analysis; health examination |
| url | https://www.frontiersin.org/articles/10.3389/fmed.2025.1587540/full |
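
The description above reports AUC, accuracy, and recall for each model. As a rough illustration of what those three metrics measure, the sketch below computes them in plain Python. It is not the authors' code: the function names are hypothetical, the labels and scores are made-up toy values rather than study data, and the 0.5 decision threshold is an assumption (the record does not state the cut-off used). The last lines simply re-derive the reported prevalence from the two counts given in the abstract.

```python
from typing import Sequence, Tuple


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_recall(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and recall (sensitivity) at an assumed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive cases")
    correct = sum(1 for y, p in zip(y_true, preds) if y == p)
    return correct / len(y_true), tp / (tp + fn)


# Toy example (NOT study data): 1 = infected, 0 = not infected.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

toy_auc = auc(toy_labels, toy_scores)
toy_accuracy, toy_recall = accuracy_and_recall(toy_labels, toy_scores)

# Prevalence re-derived from the counts stated in the abstract.
prevalence_pct = round(100 * 3278 / 10393, 2)

print(toy_auc, toy_accuracy, toy_recall, prevalence_pct)
```

AUC is threshold-free, whereas accuracy and recall depend on the chosen cut-off, which is one reason the three figures in the description need not move together across models.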