Artificial intelligence driven tumor risk stratification from single-cell transcriptomics using phenotype algebra

Bibliographic Details
Main Authors: Namrata Bhattacharya, Anja Rockstroh, Sanket Suhas Deshpande, Sam Koshy Thomas, Anunay Yadav, Chitrita Goswami, Smriti Chawla, Pierre Solomon, Cynthia Fourgeux, Gaurav Ahuja, Brett Hollier, Himanshu Kumar, Antoine Roquilly, Jeremie Poschmann, Melanie Lehman, Colleen C Nelson, Debarka Sengupta
Format: Article
Language: English
Published: eLife Sciences Publications Ltd 2025-06-01
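Series: eLife
Subjects: single-cell RNA-seq; risk stratification; transfer learning; prostate cancer; marker-free; language model
Online Access: https://elifesciences.org/articles/98469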
author Namrata Bhattacharya
Anja Rockstroh
Sanket Suhas Deshpande
Sam Koshy Thomas
Anunay Yadav
Chitrita Goswami
Smriti Chawla
Pierre Solomon
Cynthia Fourgeux
Gaurav Ahuja
Brett Hollier
Himanshu Kumar
Antoine Roquilly
Jeremie Poschmann
Melanie Lehman
Colleen C Nelson
Debarka Sengupta
author_sort Namrata Bhattacharya
collection DOAJ
description Single-cell RNA-sequencing (scRNA-seq) coupled with robust computational analysis facilitates the characterization of phenotypic heterogeneity within tumors. Current scRNA-seq analysis pipelines are capable of identifying a myriad of malignant and non-malignant cell subtypes from single-cell profiling of tumors. However, given the extent of intra-tumoral heterogeneity, it is challenging to assess the risk associated with individual cell subpopulations, primarily due to the complexity of the cancer phenotype space and the lack of clinical annotations associated with tumor scRNA-seq studies. To this end, we introduce SCellBOW, a scRNA-seq analysis framework inspired by document embedding techniques from the domain of Natural Language Processing (NLP). SCellBOW is a novel computational approach that facilitates effective identification and high-quality visualization of single-cell subpopulations. We compared SCellBOW with existing best practice methods for its ability to precisely represent phenotypically divergent cell types across multiple scRNA-seq datasets, including our in-house generated human splenocyte and matched peripheral blood mononuclear cell (PBMC) dataset. For tumor cells, SCellBOW estimates the relative risk associated with each cluster and stratifies them based on their aggressiveness. This is achieved by simulating how the presence or absence of a specific cell subpopulation influences disease prognosis. Using SCellBOW, we identified a hitherto unknown and pervasive AR−/NElow (androgen-receptor-negative, neuroendocrine-low) malignant subpopulation in metastatic prostate cancer with conspicuously high aggressiveness. Overall, the risk-stratification capabilities of SCellBOW hold promise for formulating tailored therapeutic interventions by identifying clinically relevant tumor subpopulations and their impact on prognosis.
format Article
id doaj-art-7c94b57a3fda4c91b2dd46b3b4d972e8
institution OA Journals
issn 2050-084X
language English
publishDate 2025-06-01
publisher eLife Sciences Publications Ltd
record_format Article
series eLife
spelling doaj-art-7c94b57a3fda4c91b2dd46b3b4d972e8
2025-08-20T02:09:58Z
eng
eLife Sciences Publications Ltd
eLife
2050-084X
2025-06-01
13
10.7554/eLife.98469
Artificial intelligence driven tumor risk stratification from single-cell transcriptomics using phenotype algebra
Namrata Bhattacharya 0 https://orcid.org/0000-0002-5666-2551
Anja Rockstroh 1
Sanket Suhas Deshpande 2
Sam Koshy Thomas 3
Anunay Yadav 4
Chitrita Goswami 5
Smriti Chawla 6
Pierre Solomon 7
Cynthia Fourgeux 8
Gaurav Ahuja 9 https://orcid.org/0000-0002-2837-9361
Brett Hollier 10
Himanshu Kumar 11 https://orcid.org/0000-0001-5246-2694
Antoine Roquilly 12 https://orcid.org/0000-0002-1029-6242
Jeremie Poschmann 13 https://orcid.org/0000-0002-9613-5297
Melanie Lehman 14
Colleen C Nelson 15
Debarka Sengupta 16 https://orcid.org/0000-0002-6353-5411
Australian Prostate Cancer Research Centre-Queensland, Faculty of Health, School of Biomedical Sciences, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia; Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India; Translational Research Institute, Princess Alexandra Hospital, Woolloongabba, Australia
Australian Prostate Cancer Research Centre-Queensland, Faculty of Health, School of Biomedical Sciences, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia; Translational Research Institute, Princess Alexandra Hospital, Woolloongabba, Australia
Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India
School of Mathematical Sciences, The University of Adelaide, Adelaide, Australia
Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India
Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India
Center for Computational Biomedicine, Harvard Medical School, Boston, United States
Nantes Université, CHU Nantes, INSERM, Center for Research in Transplantation and Translational Immunology, UMR, Nantes, France
Nantes Université, CHU Nantes, INSERM, Center for Research in Transplantation and Translational Immunology, UMR, Nantes, France
Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India; Centre for Artificial Intelligence, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India
Australian Prostate Cancer Research Centre-Queensland, Faculty of Health, School of Biomedical Sciences, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia; Translational Research Institute, Princess Alexandra Hospital, Woolloongabba, Australia
Laboratory of Immunology and Infectious Disease Biology, Department of Biological Sciences, Indian Institute of Science Education and Research (IISER), Bhopal, India
Nantes Université, CHU Nantes, INSERM, Center for Research in Transplantation and Translational Immunology, UMR, Nantes, France
Nantes Université, CHU Nantes, INSERM, Center for Research in Transplantation and Translational Immunology, UMR, Nantes, France
Australian Prostate Cancer Research Centre-Queensland, Faculty of Health, School of Biomedical Sciences, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia; Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
Australian Prostate Cancer Research Centre-Queensland, Faculty of Health, School of Biomedical Sciences, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia; Translational Research Institute, Princess Alexandra Hospital, Woolloongabba, Australia
Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India; Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India; Centre for Artificial Intelligence, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, India
https://elifesciences.org/articles/98469
single-cell RNA-seq
risk stratification
transfer learning
prostate cancer
marker-free
language model
title Artificial intelligence driven tumor risk stratification from single-cell transcriptomics using phenotype algebra
topic single-cell RNA-seq
risk stratification
transfer learning
prostate cancer
marker-free
language model
url https://elifesciences.org/articles/98469