Transfer learning for accelerated failure time model with microarray data
Abstract Background In microarray prognostic studies, researchers aim to identify genes associated with disease progression. However, due to the rarity of certain diseases and the cost of sample collection, researchers often face the challenge of limited sample size, which may prevent accurate estim...
| Main Authors: | Yan-Bo Pei, Zheng-Yang Yu, Jun-Shan Shen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC 2025-03-01 |
| Series: | BMC Bioinformatics |
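| Subjects: | Survival analysis; Auxiliary studies; Gene expression data; Weighted least squares; Transfer learning |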
| Online Access: | https://doi.org/10.1186/s12859-025-06056-w |
| _version_ | 1849390266604060672 |
|---|---|
| author | Yan-Bo Pei, Zheng-Yang Yu, Jun-Shan Shen |
| author_facet | Yan-Bo Pei, Zheng-Yang Yu, Jun-Shan Shen |
| author_sort | Yan-Bo Pei |
| collection | DOAJ |
| description | Abstract. Background: In microarray prognostic studies, researchers aim to identify genes associated with disease progression. However, because certain diseases are rare and sample collection is costly, researchers often face limited sample sizes, which may prevent accurate estimation and risk assessment. This challenge calls for methods that leverage information from external data (i.e., source cohorts) to improve gene selection and risk assessment based on the current sample (i.e., target cohort). Method: We propose a transfer learning method for the accelerated failure time (AFT) model that enhances the fit on the target cohort by adaptively borrowing information from the source cohorts. We use a leave-one-out cross-validation procedure to evaluate the relative stability of the selected genes and the overall predictive power. Conclusion: In simulation studies, the transfer learning method for the AFT model correctly identifies a small number of genes, and its estimation error is smaller than that obtained without the source cohorts. Furthermore, compared with directly combining the target and source cohorts in the AFT model, the proposed method demonstrates satisfactory accuracy and robustness to heterogeneity across cohorts. We analyze the GSE88770 and GSE25055 data using the proposed method. The selected genes are relatively stable, and the proposed method yields satisfactory overall risk prediction. |
| format | Article |
| id | doaj-art-8342e72822b844ca80ee13df0c71e34d |
| institution | Kabale University |
| issn | 1471-2105 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Bioinformatics |
| spelling | doaj-art-8342e72822b844ca80ee13df0c71e34d; 2025-08-20T03:41:43Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2025-03-01; 26(1):1-19; 10.1186/s12859-025-06056-w; Transfer learning for accelerated failure time model with microarray data; Yan-Bo Pei, Zheng-Yang Yu, Jun-Shan Shen (all: School of Statistics, Capital University of Economics and Business); https://doi.org/10.1186/s12859-025-06056-w; Survival analysis; Auxiliary studies; Gene expression data; Weighted least squares; Transfer learning |
| spellingShingle | Yan-Bo Pei; Zheng-Yang Yu; Jun-Shan Shen; Transfer learning for accelerated failure time model with microarray data; BMC Bioinformatics; Survival analysis; Auxiliary studies; Gene expression data; Weighted least squares; Transfer learning |
| title | Transfer learning for accelerated failure time model with microarray data |
| title_full | Transfer learning for accelerated failure time model with microarray data |
| title_fullStr | Transfer learning for accelerated failure time model with microarray data |
| title_full_unstemmed | Transfer learning for accelerated failure time model with microarray data |
| title_short | Transfer learning for accelerated failure time model with microarray data |
| title_sort | transfer learning for accelerated failure time model with microarray data |
| topic | Survival analysis; Auxiliary studies; Gene expression data; Weighted least squares; Transfer learning |
| url | https://doi.org/10.1186/s12859-025-06056-w |
| work_keys_str_mv | AT yanbopei transferlearningforacceleratedfailuretimemodelwithmicroarraydata AT zhengyangyu transferlearningforacceleratedfailuretimemodelwithmicroarraydata AT junshanshen transferlearningforacceleratedfailuretimemodelwithmicroarraydata |
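
The record lists "Weighted least squares" among its topics and describes fitting an AFT model, log T = intercept + x'beta + error, to censored survival times. As a rough illustration only, the sketch below fits a single-cohort AFT model by weighted least squares using Kaplan-Meier (Stute) weights. That weighting scheme is an assumption, since this record does not say which weights the authors use, and the function names `stute_weights` and `fit_aft_wls` are hypothetical. The transfer step that borrows from source cohorts is not implemented here.

```python
import numpy as np


def stute_weights(time, event):
    """Kaplan-Meier (Stute) weights, returned in the original sample order.

    Assumed weighting scheme; censored observations receive weight 0.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    n = len(time)
    # Sort by observed time; at ties, events come before censored cases.
    order = np.lexsort((1 - event, time))
    d = event[order]
    ranks = np.arange(1, n + 1)
    # Factor for position j: ((n - j) / (n - j + 1)) ** d_j
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Product of the factors over all positions strictly before i.
    prev_prod = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    w_sorted = d / (n - ranks + 1.0) * prev_prod
    w = np.empty(n)
    w[order] = w_sorted
    return w


def fit_aft_wls(X, time, event):
    """Weighted least squares fit of log(time) on X; returns (intercept, beta).

    Assumes at least one uncensored observation, so the weights do not all vanish.
    """
    X = np.asarray(X, dtype=float)
    y = np.log(np.asarray(time, dtype=float))
    w = stute_weights(time, event)
    # Weighted centering absorbs the intercept.
    x_bar = w @ X / w.sum()
    y_bar = w @ y / w.sum()
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(sw[:, None] * (X - x_bar), sw * (y - y_bar), rcond=None)
    return y_bar - x_bar @ beta, beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2))                      # toy "expression" matrix
    t = np.exp(0.5 + X @ np.array([1.0, -1.0]) + 0.1 * rng.normal(size=50))
    c = rng.exponential(scale=5.0, size=50)           # toy censoring times
    intercept, beta = fit_aft_wls(X, np.minimum(t, c), (t <= c).astype(float))
    print(intercept, beta)
```

With many more genes than samples, as in microarray data, a penalized version of this fit would be needed for gene selection; the unpenalized solve above is only meant to show how censoring enters through the weights.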