A Bayesian nonparametric method for jointly clustering multiple spatial transcriptomic datasets and simultaneous gene selection

Bibliographic Details
Main Authors: Donald Turner, Yang Ni
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-11693-5
Collection: DOAJ
Abstract: In spatial transcriptomics, many algorithms are available for clustering cells into groups based on gene expression and location, although not without limitations. Such limitations include having to know the number of clusters, limiting inference to only one donor, and being unable to identify information common to multiple donors. To address these limitations, we propose a Bayesian nonparametric clustering algorithm capable of incorporating spatial transcriptomic data from multiple donors, which can identify clusters both common to all donors and idiosyncratic for each donor, features a variable selection of informative genes, and is able to determine the number of clusters automatically. Our method makes use of a Bayesian nonparametric method for combining inference across donors and a partition distribution indexed by pairwise distance information to cluster both within and across multiple spatial transcriptomics datasets. In our simulations and a real-data application, we show that our method can outperform other commonly used clustering algorithms.
ISSN: 2045-2322
Author Affiliations: Donald Turner (Texas A&M University, Department of Statistics); Yang Ni (Texas A&M University, Department of Statistics)
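
Illustrative note: the abstract refers to "a partition distribution indexed by pairwise distance information" and to determining the number of clusters automatically. The record does not specify which distribution the authors use, so the sketch below is not their method. It substitutes a well-known stand-in, the distance-dependent Chinese restaurant process, to show how pairwise distances alone can induce a random partition with no preset number of clusters. The function name, the exponential decay, and the parameters alpha and length_scale are all assumptions made for illustration.

```python
import numpy as np


def sample_ddcrp_partition(coords, alpha=1.0, length_scale=1.0, seed=0):
    """Draw one partition from a distance-dependent CRP prior (illustrative only).

    coords: (n, d) array of spot locations (hypothetical input).
    alpha: weight of a self-link, which lets a spot start its own cluster.
    length_scale: assumed decay scale; larger values link more distant spots.
    """
    rng = np.random.default_rng(seed)
    n = coords.shape[0]

    # Pairwise Euclidean distances between all spots.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    # Exponential decay: nearby spots receive larger link weights.
    weights = np.exp(-dists / length_scale)
    np.fill_diagonal(weights, alpha)
    probs = weights / weights.sum(axis=1, keepdims=True)

    # Each spot links to exactly one spot (possibly itself).
    links = [int(rng.choice(n, p=probs[i])) for i in range(n)]

    # Clusters are the connected components of the link graph (union-find).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in enumerate(links):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j

    roots = np.array([find(i) for i in range(n)])
    # Relabel component roots as consecutive integers 0..K-1.
    _, labels = np.unique(roots, return_inverse=True)
    return labels


if __name__ == "__main__":
    # Two well-separated groups of made-up 2-D locations.
    group_a = np.column_stack([np.arange(5) * 0.1, np.zeros(5)])
    group_b = group_a + 100.0
    labels = sample_ddcrp_partition(np.vstack([group_a, group_b]), length_scale=0.5)
    print(labels)
```

The number of clusters is an outcome of the sampled links rather than an input, which is the property the abstract highlights. A full model would combine such a prior with a gene-expression likelihood, sharing across donors, and gene selection, none of which is attempted here.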