Machine learning algorithms that predict the risk of prostate cancer based on metabolic syndrome and sociodemographic characteristics: a prospective cohort study

Abstract. Background: Given the rapid increase in the prevalence of prostate cancer (PCa), identifying its risk factors and developing suitable risk prediction models have important implications for public health. We used a machine learning (ML) approach to screen participants at high risk of PCa and,...

Bibliographic Details
Main Authors: Tao Thi Tran, Jeonghee Lee, Junetae Kim, Sun-Young Kim, Hyunsoon Cho, Jeongseon Kim
Format: Article
Language: English
Published: BMC 2024-12-01
Series: BMC Public Health
Subjects: Prostate cancer, Machine learning, Metabolic syndrome, Korea
Online Access: https://doi.org/10.1186/s12889-024-20852-8
_version_ 1850253964253069312
author Tao Thi Tran
Jeonghee Lee
Junetae Kim
Sun-Young Kim
Hyunsoon Cho
Jeongseon Kim
collection DOAJ
description Abstract. Background: Given the rapid increase in the prevalence of prostate cancer (PCa), identifying its risk factors and developing suitable risk prediction models have important implications for public health. We used a machine learning (ML) approach to screen participants at high risk of PCa and, specifically, investigated whether participants with metabolic syndrome (MetS) exhibited an elevated PCa risk. Methods: A prospective cohort study was performed with 41,837 participants in South Korea. We predicted PCa based on MetS, its components, and sociodemographic factors using Cox proportional hazards and five ML models. The integrated Brier score (IBS) and C-index were used to assess model performance. Results: A total of 210 incident PCa cases were identified. We found good calibration and discrimination for all models (C-index ≥ 0.800 and IBS = 0.01). Importantly, performance increased after excluding MetS and its components from the models; the highest C-index was 0.862, for the survival support vector machine. In contrast, first-degree family history of PCa, alcohol consumption, age, and income were valuable for PCa prediction. Conclusion: ML models are an effective approach for developing prediction models for survival analysis. Furthermore, MetS and its components do not seem to influence PCa susceptibility, in contrast to first-degree family history of PCa, age, alcohol consumption, and income.
format Article
id doaj-art-3c126189ffb44896b9d734223ccf4068
institution OA Journals
issn 1471-2458
language English
publishDate 2024-12-01
publisher BMC
record_format Article
series BMC Public Health
spelling doaj-art-3c126189ffb44896b9d734223ccf4068 2025-08-20T01:57:14Z eng BMC BMC Public Health 1471-2458 2024-12-01 24 1 1 12 10.1186/s12889-024-20852-8
Tao Thi Tran: Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy
Jeonghee Lee: Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy
Junetae Kim: Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy
Sun-Young Kim: Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy
Hyunsoon Cho: Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy
Jeongseon Kim: Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy
title Machine learning algorithms that predict the risk of prostate cancer based on metabolic syndrome and sociodemographic characteristics: a prospective cohort study
topic Prostate cancer
Machine learning
Metabolic syndrome
Korea
url https://doi.org/10.1186/s12889-024-20852-8