Machine learning models for predicting the risk of depressive symptoms in Chinese college students

Bibliographic Details
Main Authors: Chengfu Yu, Xiangxuan Kong, Weijie Yu, Xingcan Ni, Jing Chen, Xiaoyan Liao
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Psychiatry
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1648585/full
Description
Summary: Introduction: Depression is highly prevalent among college students, and accurately identifying risk factors is essential for timely intervention. Given the limitations of traditional linear models in managing high-dimensional data, this study employed machine learning techniques to predict depressive symptoms. Method: Data were collected from 1,635 Chinese college students and included 38 sociodemographic, psychological, and social variables. Four machine learning algorithms (Random Forest, XGBoost, LightGBM, and Support Vector Machine) were evaluated. Results: The Random Forest model achieved the highest discriminative performance, with an AUC of 0.87 and an accuracy of 0.79, and identified key predictors such as sleep disturbance, perceived stress, experiential avoidance, and self-criticism. SHapley Additive exPlanations (SHAP) analysis further revealed that deteriorating sleep quality and heightened stress levels significantly increased the risk of depressive symptoms. Discussion: These findings validate the effectiveness of Random Forest in capturing complex data interactions and offer actionable insights for targeted mental health interventions. Future studies should improve generalizability by incorporating more diverse samples and physiological biomarkers.
ISSN: 1664-0640
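
The summary reports model quality as AUC and accuracy. As a rough illustration of what those two metrics measure, the sketch below computes both from scratch in plain Python. It is not the authors' evaluation code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and none of the numbers come from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases classified correctly when scores at or above the
    threshold are predicted positive (the 0.5 default is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels)


# Hypothetical toy data (1 = depressive symptoms present), not study data.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)    # 15 of 16 pairs ranked correctly: 0.9375
toy_acc = accuracy(toy_labels, toy_scores)   # 6 of 8 cases correct: 0.75
```

AUC depends only on how the model ranks cases, while accuracy also depends on the chosen threshold, which is one reason the two figures reported for a model can differ noticeably.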