Predicting depression and unravelling its heterogeneous influences in middle-aged and older people populations: a machine learning approach

Bibliographic Details
Main Authors: Ling Zhang, Ruigang Wei, Jingwen Zhou, Lin Tan, Xiaolong Che, Minqinag Zhang, Xiaoyue Ning, Zhiliang Zhong
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Psychology
Subjects: Depression symptoms; Machine learning; Deep learning; LSTM; CNN; Longitudinal study
Online Access: https://doi.org/10.1186/s40359-025-02691-3
_version_ 1850148485779685376
author Ling Zhang
Ruigang Wei
Jingwen Zhou
Lin Tan
Xiaolong Che
Minqinag Zhang
Xiaoyue Ning
Zhiliang Zhong
author_sort Ling Zhang
collection DOAJ
description Abstract
Background: Aging has become a global trend, and depression, as an accompanying issue, poses a significant threat to the health of middle-aged and older adults. Existing studies primarily rely on statistical methods such as logistic regression for small-scale data analysis, while research on the application of machine learning to large-scale data remains limited. Therefore, this study employs machine learning methods to explore the risk factors for depression among middle-aged and older adults in China.
Methods: Using a two-step hybrid model combining long short-term memory (LSTM) and machine learning (ML), we compared 20 depression risk/protective factors in a balanced panel dataset of middle-aged and elderly Chinese adults (N = 3706; aged 45–94; 64.65% female; 41.20% middle-aged) from the China Health and Retirement Longitudinal Study (CHARLS). Data were collected across five waves (2011, 2013, 2015, 2018, and 2020). The LSTM model predicted risk factors for the fifth wave using data from the preceding four waves. Five ML models were then used to classify depression (yes/no) based on these factors, which included demographic, lifestyle, health, and socioeconomic variables.
Results: The LSTM model effectively predicted depression-related variables (mean squared error = 0.067). The average AUC of the five ML models ranged from 0.78 to 0.82. The key predictive factors were disability, life satisfaction, activities of daily living (ADL) impairment, chronic diseases, and self-reported memory. For the middle-aged group, the top three factors were disability, life satisfaction, and chronic diseases; for the older group, they were life satisfaction, chronic diseases, and ADL impairment.
Conclusion: The two-step hybrid model ("LSTM + ML") effectively predicted depression over 2 years using demographic and health data, aiding early diagnosis and intervention.
format Article
id doaj-art-5c8e106c2cb24313ae02e673caedc0f3
institution OA Journals
issn 2050-7283
language English
publishDate 2025-04-01
publisher BMC
record_format Article
series BMC Psychology
spelling doaj-art-5c8e106c2cb24313ae02e673caedc0f3 2025-08-20T02:27:14Z eng BMC BMC Psychology 2050-7283 2025-04-01 13 1 1 15 10.1186/s40359-025-02691-3
Predicting depression and unravelling its heterogeneous influences in middle-aged and older people populations: a machine learning approach
Ling Zhang (0), Ruigang Wei (1), Jingwen Zhou (2), Lin Tan (3), Xiaolong Che (4), Minqinag Zhang (5), Xiaoyue Ning (6), Zhiliang Zhong (7)
Affiliation (0–7): School of Software and Internet of Things, Jiangxi University of Finance and Economics
https://doi.org/10.1186/s40359-025-02691-3
Depression symptoms; Machine learning; Deep learning; LSTM; CNN; Longitudinal study
title Predicting depression and unravelling its heterogeneous influences in middle-aged and older people populations: a machine learning approach
topic Depression symptoms
Machine learning
Deep learning
LSTM
CNN
Longitudinal study
url https://doi.org/10.1186/s40359-025-02691-3
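
The abstract reports two evaluation figures: a mean squared error for the LSTM forecasts and an AUC for the depression classifiers. As a rough illustration of what those two metrics measure, the following minimal sketch computes both in plain Python. It is not the authors' code; the function names and the toy numbers are invented for illustration and are not CHARLS data.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and forecast values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy values, invented for illustration only.
observed = [0.0, 1.0, 0.5, 0.25]   # scaled feature values seen in the last wave
forecast = [0.1, 0.8, 0.5, 0.35]   # values a forecaster might have produced
print(mean_squared_error(observed, forecast))

depressed = [1, 0, 1, 0, 0]             # 1 = depression, 0 = no depression
risk_score = [0.9, 0.3, 0.4, 0.5, 0.1]  # classifier scores for the same people
print(roc_auc(depressed, risk_score))
```

The pairwise AUC above is quadratic in the sample size, which is fine for a small demonstration; for a panel of several thousand respondents a rank-based implementation from a statistics library would be the more practical choice.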