Application of machine learning algorithms in predicting new onset hypertension: a study based on the China Health and Nutrition Survey

Background: Hypertension is a serious chronic disease that can lead to various cardiovascular diseases and affect vital organs such as the heart, brain, and kidneys. Our goal is to predict the risk of new onset hypertension using machine learning algorithms and identify the characteris...

Bibliographic Details
Main Authors: Manhui Zhang, Xian Xia, Qiqi Wang, Yue Pan, Guanyi Zhang, Zhigang Wang
Format: Article
Language: English
Published: Komiyama Printing Co. Ltd 2025-01-01
Series: Environmental Health and Preventive Medicine
Subjects:
Online Access: https://www.jstage.jst.go.jp/article/ehpm/30/0/30_24-00270/_html/-char/en
author Manhui Zhang
Xian Xia
Qiqi Wang
Yue Pan
Guanyi Zhang
Zhigang Wang
collection DOAJ
description Background: Hypertension is a serious chronic disease that can lead to various cardiovascular diseases and affect vital organs such as the heart, brain, and kidneys. Our goal is to predict the risk of new onset hypertension using machine learning algorithms and to identify the characteristics of patients with new onset hypertension. Methods: We analyzed data from the 2011 China Health and Nutrition Survey cohort, restricted to individuals who were not hypertensive at baseline and had follow-up results available by 2015. We evaluated four traditional machine learning algorithms commonly used in epidemiological studies (Logistic Regression, Support Vector Machine, XGBoost, and LightGBM) and two deep learning algorithms (TabNet and AMFormer). Each model was built twice, once with 16 features and once with 29 features. SHAP values were used to select key features associated with new onset hypertension. Results: A total of 4,982 participants were included in the analysis, of whom 1,017 developed hypertension during the 4-year follow-up. Among the 16-feature models, Logistic Regression had the highest AUC, 0.784 (0.775–0.806). Among the 29-feature models, AMFormer performed best, with an AUC of 0.802 (0.795–0.820) and the highest MCC (0.417, 95% CI: 0.400–0.434) and F1 (0.503, 95% CI: 0.484–0.505), indicating better overall performance than the other models. The key features selected with AMFormer included age, province, waist circumference, urban or rural location, education level, employment status, weight, waist-to-hip ratio (WHR), and BMI. Conclusion: We applied the AMFormer model to the prediction of new onset hypertension for the first time, and it achieved the best results among the six algorithms tested. Key features associated with new onset hypertension can be identified with this algorithm. Applying machine learning algorithms can further improve disease prediction and help identify disease risk factors.
format Article
id doaj-art-77aca0397fe94f35a08d254092cb2332
institution Kabale University
issn 1342-078X
1347-4715
language English
publishDate 2025-01-01
publisher Komiyama Printing Co. Ltd
record_format Article
series Environmental Health and Preventive Medicine
spelling doaj-art-77aca0397fe94f35a08d254092cb2332
2025-01-30T00:05:38Z
eng
Komiyama Printing Co. Ltd
Environmental Health and Preventive Medicine
1342-078X
1347-4715
2025-01-01
30
33
10.1265/ehpm.24-00270
ehpm
Application of machine learning algorithms in predicting new onset hypertension: a study based on the China Health and Nutrition Survey
Manhui Zhang (0), Xian Xia (1), Qiqi Wang (2), Yue Pan (3), Guanyi Zhang (4), Zhigang Wang (5)
(0) Department of Disease Control and Prevention, The Seventh Medical Center of Chinese PLA General Hospital
(1) Department of Disease Control and Prevention, The Seventh Medical Center of Chinese PLA General Hospital
(2) Office of Epidemiology (Technical Guidance Office for Patriotic Health Work), Chinese Center for Disease Control and Prevention
(3) Department of Disease Control and Prevention, The Seventh Medical Center of Chinese PLA General Hospital
(4) Department of Neurology, Beijing Tiantan Hospital, Capital Medical University
(5) Department of Disease Control and Prevention, The Seventh Medical Center of Chinese PLA General Hospital
title Application of machine learning algorithms in predicting new onset hypertension: a study based on the China Health and Nutrition Survey
topic machine learning algorithms
prediction
new onset hypertension
chns
url https://www.jstage.jst.go.jp/article/ehpm/30/0/30_24-00270/_html/-char/en
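
The description field reports cohort counts (4,982 participants, 1,017 new cases) together with MCC and F1 scores. As a reading aid, the following minimal sketch shows how the 4-year cumulative incidence follows from those counts and how MCC and F1 are defined from a confusion matrix. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the article and do not reproduce its results.

```python
from math import sqrt


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1: harmonic mean of precision and recall, written in count form."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Counts stated in the record's description field.
n_total = 4982
n_cases = 1017
incidence = n_cases / n_total  # 4-year cumulative incidence

# HYPOTHETICAL confusion matrix (not from the article), constrained only
# so that positives sum to 1,017 and negatives to 4,982 - 1,017 = 3,965.
tp, fn = 600, 417
tn, fp = 3200, 765

print(f"cumulative incidence: {incidence:.3f}")
print(f"illustrative F1:  {f1_score(tp, fp, fn):.3f}")
print(f"illustrative MCC: {mcc(tp, fp, tn, fn):.3f}")
```

Because only about one in five participants developed hypertension, the classes are imbalanced, which is one common reason to report MCC and F1 alongside AUC.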