A Bayesian nonparametric method for jointly clustering multiple spatial transcriptomic datasets and simultaneous gene selection

Bibliographic Details
Main Authors: Donald Turner, Yang Ni
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-11693-5
Description
Summary: In spatial transcriptomics, many algorithms are available for clustering cells into groups based on gene expression and location, although not without limitations. Such limitations include having to know the number of clusters, limiting inference to only one donor, and being unable to identify information common to multiple donors. To address these limitations, we propose a Bayesian nonparametric clustering algorithm capable of incorporating spatial transcriptomic data from multiple donors, which can identify clusters both common to all donors and idiosyncratic for each donor, features a variable selection of informative genes, and is able to determine the number of clusters automatically. Our method makes use of a Bayesian nonparametric method for combining inference across donors and a partition distribution indexed by pairwise distance information to cluster both within and across multiple spatial transcriptomics datasets. In our simulations and a real-data application, we show that our method can outperform other commonly used clustering algorithms.
ISSN: 2045-2322