A machine learning approach for multimodal data fusion for survival prediction in cancer patients
Abstract: Technological advancements of the past decade have transformed cancer research, improving patient survival predictions through genotyping and multimodal data analysis. However, there is no comprehensive machine-learning pipeline for comparing methods to enhance these predictions. To address...
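| Main Authors: | Nikolaos Nikolaou, Domingo Salazar, Harish RaviPrakash, Miguel Gonçalves, Rob Mulla, Nikolay Burlutskiy, Natasha Markuzon, Etai Jacob |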
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | npj Precision Oncology |
| Online Access: | https://doi.org/10.1038/s41698-025-00917-6 |
| _version_ | 1850277705637953536 |
|---|---|
| author | Nikolaos Nikolaou, Domingo Salazar, Harish RaviPrakash, Miguel Gonçalves, Rob Mulla, Nikolay Burlutskiy, Natasha Markuzon, Etai Jacob |
| author_sort | Nikolaos Nikolaou |
| collection | DOAJ |
| description | Technological advancements of the past decade have transformed cancer research, improving patient survival predictions through genotyping and multimodal data analysis. However, there is no comprehensive machine-learning pipeline for comparing methods to enhance these predictions. To address this, a versatile pipeline using The Cancer Genome Atlas (TCGA) data was developed, incorporating various data modalities such as transcripts, proteins, metabolites, and clinical factors. This approach manages challenges like high dimensionality, small sample sizes, and data heterogeneity. By applying different feature extraction and fusion strategies, notably late fusion models, the effectiveness of integrating diverse data types was demonstrated. Late fusion models consistently outperformed single-modality approaches in TCGA lung, breast, and pan-cancer datasets, offering higher accuracy and robustness. This research highlights the potential of comprehensive multimodal data integration in precision oncology to improve survival predictions for cancer patients. The study provides a reusable pipeline for the research community, suggesting future work on larger cohorts. |
| format | Article |
| id | doaj-art-44420503b81b47b3b68c8d1f44692fca |
| institution | OA Journals |
| issn | 2397-768X |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Precision Oncology |
| spelling | doaj-art-44420503b81b47b3b68c8d1f44692fca; 2025-08-20T01:49:46Z; eng; Nature Portfolio; npj Precision Oncology; 2397-768X; 2025-05-01; 91114; 10.1038/s41698-025-00917-6; A machine learning approach for multimodal data fusion for survival prediction in cancer patients; Nikolaos Nikolaou, Domingo Salazar, Harish RaviPrakash, Miguel Gonçalves, Rob Mulla, Nikolay Burlutskiy, Natasha Markuzon, Etai Jacob (all Oncology Data Science, Oncology R&D, AstraZeneca); https://doi.org/10.1038/s41698-025-00917-6 |
| title | A machine learning approach for multimodal data fusion for survival prediction in cancer patients |
| url | https://doi.org/10.1038/s41698-025-00917-6 |