Machine learning-based identification of co-expressed genes in prostate cancer and CRPC and construction of prognostic models
Abstract The objective of this study was to employ machine learning to identify shared differentially expressed genes (DEGs) in prostate cancer (PCa) initiation and castration resistance, aiming to establish a robust prognostic model and enhance understanding of patient prognosis for personalized tr...
| Main Authors: | Changhui Fan, Zhiheng Huang, Han Xu, Tianhe Zhang, Haiyang Wei, Junfeng Gao, Changbao Xu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-02-01 |
| Series: | Scientific Reports |
| Subjects: | Prostate cancer; Castration resistance; TCGA; Differentially expressed genes; Prognostic modeling |
| Online Access: | https://doi.org/10.1038/s41598-025-90444-y |
| _version_ | 1850067483363377152 |
|---|---|
| author | Changhui Fan Zhiheng Huang Han Xu Tianhe Zhang Haiyang Wei Junfeng Gao Changbao Xu Changhui Fan |
| author_sort | Changhui Fan |
| collection | DOAJ |
| description | Abstract: The objective of this study was to use machine learning to identify differentially expressed genes (DEGs) shared between prostate cancer (PCa) initiation and castration resistance, with the aim of establishing a robust prognostic model and improving understanding of patient prognosis for personalized treatment strategies. mRNA transcriptome data associated with castration-resistant prostate cancer (CRPC) were obtained from the GEO database. Differential expression analysis was conducted with the limma R package, comparing normal prostate samples with PCa samples and PCa samples with CRPC samples. Next, we applied LASSO regression and univariate and multivariate Cox regression analyses to identify genes linked to prognosis and to build prognostic models. Validation was performed on the TCGA_PRAD dataset to confirm expression differences of the hub genes and to explore their correlation with clinical variables and their prognostic significance. We established a prostate cancer risk prognostic model containing seven genes (KIF4A, UBE2C, FAM72D, CCDC78, HOXD9, LIX1, and SLC5A8) and verified its accuracy on an independent dataset. The calibration curve and decision curve results indicate that the model has potential clinical application value, and the nomogram can accurately predict patient prognosis. Additionally, elevated expression of KIF4A, UBE2C, and FAM72D, or reduced expression of LIX1, correlated with advanced pathological T and N stages, clinical T stage, prostate-specific antigen (PSA) level, age at diagnosis, Gleason score, and shorter progression-free interval (PFI) (P < 0.05). By integrating bioinformatics analysis and clinical data, we not only established a reliable prognostic model for prostate cancer but also identified key genes pivotal in disease progression and treatment resistance. These findings provide novel insights and methods for assessing prognosis and tailoring treatment strategies for prostate cancer patients. |
| format | Article |
| id | doaj-art-a97ef811da804e5d8a8c8b9d44a140a7 |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-a97ef811da804e5d8a8c8b9d44a140a7; 2025-08-20T02:48:18Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-02-01; vol. 15, no. 1, pp. 1-13; 10.1038/s41598-025-90444-y; Machine learning-based identification of co-expressed genes in prostate cancer and CRPC and construction of prognostic models; Changhui Fan (Second Affiliated Hospital of Zhengzhou University); Zhiheng Huang (Second Affiliated Hospital of Zhengzhou University); Han Xu (School of Basic Medical Sciences, Zhengzhou University); Tianhe Zhang (Second Affiliated Hospital of Zhengzhou University); Haiyang Wei (Second Affiliated Hospital of Zhengzhou University); Junfeng Gao (Second Affiliated Hospital of Zhengzhou University); Changbao Xu (Second Affiliated Hospital of Zhengzhou University); Changhui Fan (Second Affiliated Hospital of Zhengzhou University); https://doi.org/10.1038/s41598-025-90444-y; Prostate cancer; Castration resistance; TCGA; Differentially expressed genes; Prognostic modeling |
| title | Machine learning-based identification of co-expressed genes in prostate cancer and CRPC and construction of prognostic models |
| topic | Prostate cancer Castration resistance TCGA Differentially expressed genes Prognostic modeling |
| url | https://doi.org/10.1038/s41598-025-90444-y |
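
The abstract describes two steps that a short example may make more concrete: keeping only the genes that are differentially expressed in both comparisons (normal vs. PCa and PCa vs. CRPC), and turning a multi-gene Cox model into a per-patient risk score. The Python sketch below illustrates both under stated assumptions. It is not the authors' pipeline, which the record says used the limma R package together with LASSO and Cox regression. The function names, the median cutoff, and every coefficient and expression value are hypothetical placeholders chosen for illustration; the record does not report the fitted coefficients or the cutoff.

```python
from statistics import median

# Seven genes named in the abstract's prognostic model.
MODEL_GENES = ["KIF4A", "UBE2C", "FAM72D", "CCDC78", "HOXD9", "LIX1", "SLC5A8"]


def shared_degs(normal_vs_pca, pca_vs_crpc):
    """Return genes present in both DEG lists, sorted alphabetically."""
    return sorted(set(normal_vs_pca) & set(pca_vs_crpc))


def risk_score(expression, coefs):
    """Linear predictor of a Cox-type model: sum of coefficient * expression.

    `coefs` maps gene -> coefficient; `expression` maps gene -> expression level.
    """
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def stratify(scores):
    """Label each patient 'high' or 'low' risk relative to the median score.

    The median split is an assumption of this sketch, not a cutoff
    taken from the record.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Placeholder coefficients for two of the model genes (NOT fitted values).
    placeholder_coefs = {"KIF4A": 1.0, "LIX1": -1.0}
    # Placeholder expression values for three made-up patients.
    patients = {
        "p1": {"KIF4A": 1.0, "LIX1": 3.0},
        "p2": {"KIF4A": 2.0, "LIX1": 2.0},
        "p3": {"KIF4A": 3.0, "LIX1": 1.0},
    }
    scores = {pid: risk_score(expr, placeholder_coefs) for pid, expr in patients.items()}
    print(stratify(scores))
```

In practice the coefficients would come from the fitted multivariate Cox model and the expression values from the normalized cohort data; the sketch only shows how the pieces connect.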