Machine learning-driven identification of key risk factors for predicting depression among nurses

Abstract Background Since the outbreak of the coronavirus disease (COVID-19) in 2019, caused by SARS-CoV-2, the disease has become a global health threat due to its high infectivity, morbidity, and mortality rates. With China’s comprehensive relaxation of pandemic control policies in 2022, the risk...

Bibliographic Details
Main Authors: Xiaoyan Qi, Xin Huang
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Nursing
Subjects: Depression; COVID-19; Full reopening; Machine learning; Risk factors; Cross-sectional study
Online Access: https://doi.org/10.1186/s12912-025-02957-6
_version_ 1850269608591753216
author Xiaoyan Qi
Xin Huang
author_facet Xiaoyan Qi
Xin Huang
author_sort Xiaoyan Qi
collection DOAJ
description Abstract. Background: Since the outbreak of coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, the disease has become a global health threat because of its high infectivity, morbidity, and mortality. With China’s comprehensive relaxation of pandemic control policies in 2022, the risk of infection for nursing personnel has further increased. Objectives: This study aims to identify risk factors associated with depression among nursing staff during China’s full reopening after COVID-19 controls in 2022 and to construct a predictive model to assess that risk. Methods: From December 9, 2022, to April 6, 2023, a cross-sectional study was conducted in three hospitals in Anhui Province, including 293 nursing staff. The research subjects were divided into a depression group and a non-depression group, and SPSS 23.0 software was used to analyze the data of both groups. We developed four predictive machine learning models: logistic regression, support vector machine (SVM), extreme gradient boosting (XGBoost), and adaptive boosting (AdaBoost). The models were developed and validated with open-source Python libraries such as Scikit-learn and XGBoost. They were trained and validated using 10-fold cross-validation, and the final model was selected by the area under the receiver operating characteristic curve (AUC). Results: The AUC values for the logistic regression, SVM, XGBoost, and AdaBoost models were 0.86, 0.88, 0.95, and 0.93, respectively, with F1 scores of 0.79, 0.83, 0.90, and 0.89, respectively. The XGBoost model achieved the highest AUC and F1 score. However, the study’s findings are limited by the small sample size and single location, and further validation is needed to confirm the model’s generalizability. The XGBoost model, tailored to common risk factors among Chinese nursing staff, provides a powerful tool for predicting the risk of depression. Conclusion: This model can assist clinical managers in accurately identifying and addressing potential risk factors during and after the full reopening. Because the working environment and stressors faced by nursing staff may vary across countries, these findings from China can promote international exchange and cooperation in managing the mental health of nursing staff. Future research should focus on larger, multi-center studies to validate the model’s performance and explore additional risk factors. Clinical trial number: Not applicable, because this article reports a cross-sectional study.
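The model comparison described in the Methods (four classifiers, 10-fold cross-validation, selection by AUC) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic (generated with `make_classification`, sized to 293 rows only to echo the sample size), the feature count, hyperparameters, scaling step, and stratified shuffling are all guesses, and if the `xgboost` package is unavailable the sketch falls back to scikit-learn's `GradientBoostingClassifier` as a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: use XGBoost when installed, otherwise a scikit-learn
# gradient-boosting stand-in (not the same algorithm, only similar in spirit).
try:
    from xgboost import XGBClassifier
    boosted = XGBClassifier(n_estimators=100, eval_metric="logloss", random_state=0)
except Exception:  # xgboost missing or unusable on this machine
    boosted = GradientBoostingClassifier(random_state=0)

# Assumption: synthetic stand-in data; the study's survey data are not reproduced here.
X, y = make_classification(
    n_samples=293, n_features=12, n_informative=6, random_state=0
)

# Hypothetical model settings; the record does not report hyperparameters.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "xgboost": boosted,
    "adaboost": AdaBoostClassifier(random_state=0),
}

# 10-fold cross-validation, scoring each model by AUC and F1.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=("roc_auc", "f1"))
    results[name] = {
        "auc": scores["test_roc_auc"].mean(),
        "f1": scores["test_f1"].mean(),
    }

# Select the model with the highest mean cross-validated AUC.
best = max(results, key=lambda n: results[n]["auc"])

for name, r in results.items():
    print(f"{name}: AUC={r['auc']:.2f}, F1={r['f1']:.2f}")
print("selected:", best)
```

Because the data here are synthetic, the printed scores say nothing about the values reported in the abstract; the sketch only shows the shape of the comparison.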
format Article
id doaj-art-1701ec2b249f42e794f0fb741960bae5
institution OA Journals
issn 1472-6955
language English
publishDate 2025-04-01
publisher BMC
record_format Article
series BMC Nursing
spelling doaj-art-1701ec2b249f42e794f0fb741960bae5 | 2025-08-20T01:53:04Z | eng | BMC | BMC Nursing | 1472-6955 | 2025-04-01 | 24 | 1 | 1 | 11 | 10.1186/s12912-025-02957-6 | Machine learning-driven identification of key risk factors for predicting depression among nurses | Xiaoyan Qi (School of Nursing, Anhui Medical University) | Xin Huang (School of Management, Anhui University) | https://doi.org/10.1186/s12912-025-02957-6 | Depression | COVID-19 | Full reopening | Machine learning | Risk factors | Cross-sectional study
spellingShingle Xiaoyan Qi
Xin Huang
Machine learning-driven identification of key risk factors for predicting depression among nurses
BMC Nursing
Depression
COVID-19
Full reopening
Machine learning
Risk factors
Cross-sectional study
title Machine learning-driven identification of key risk factors for predicting depression among nurses
title_full Machine learning-driven identification of key risk factors for predicting depression among nurses
title_fullStr Machine learning-driven identification of key risk factors for predicting depression among nurses
title_full_unstemmed Machine learning-driven identification of key risk factors for predicting depression among nurses
title_short Machine learning-driven identification of key risk factors for predicting depression among nurses
title_sort machine learning driven identification of key risk factors for predicting depression among nurses
topic Depression
COVID-19
Full reopening
Machine learning
Risk factors
Cross-sectional study
url https://doi.org/10.1186/s12912-025-02957-6
work_keys_str_mv AT xiaoyanqi machinelearningdrivenidentificationofkeyriskfactorsforpredictingdepressionamongnurses
AT xinhuang machinelearningdrivenidentificationofkeyriskfactorsforpredictingdepressionamongnurses