Deep learning powered single-cell clustering framework with enhanced accuracy and stability

Abstract Single-cell RNA sequencing (scRNA-seq) has revolutionized the field of cellular diversity research. Unsupervised clustering, a key technique in this exploration, allows for the identification of distinct cell types within a population. Graph-based deep clustering methods have shown promise...

Bibliographic Details
Main Authors: Yi Zhang, Xi Feng, Yin Wang, Kai Shi
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Reports
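Subjects: Deep structural clustering; Unsupervised clustering; scRNA-seq; TAGCN; Attention mechanism; Cellular heterogeneity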
Online Access: https://doi.org/10.1038/s41598-025-87672-7
_version_ 1823862352872734720
author Yi Zhang
Xi Feng
Yin Wang
Kai Shi
collection DOAJ
description Abstract: Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of cellular diversity. Unsupervised clustering, a key technique in this field, identifies distinct cell types within a population. Graph-based deep clustering methods have shown promise in preserving the structural relationships between cells (nodes) in the data. However, these methods often neglect the inherent distribution of nodes in the graph, which leads to incomplete representations of cell populations. In addition, conventional graph convolutional networks (GCNs) can suffer from oversmoothing, a phenomenon in which the network loses the ability to differentiate between samples with similar expression profiles. To address these limitations, we propose scG-cluster, a deep structural clustering method with two key innovations: (1) Dual-topology adjacency graph: scG-cluster integrates information about node distribution into the traditional adjacency graph used by GCNs, enriching the graph representation by capturing the spatial relationships between cells in addition to their pairwise similarities. (2) Dual-topology adaptive graph convolutional network (TAGCN): the framework employs a TAGCN architecture with residual concatenation. The network uses an attention mechanism to dynamically weight features within the graph, focusing on the aspects most informative for clustering, and residual connections to combat oversmoothing, so that the network retains the ability to distinguish subtle differences in cell expression profiles. Furthermore, scG-cluster iteratively refines the clustering centers, which improves the stability and accuracy of the final cluster assignments. Extensive evaluations on six diverse scRNA-seq datasets demonstrate that scG-cluster consistently outperforms existing state-of-the-art methods in both clustering accuracy and scalability. Ablation studies validate the significant contributions of both the residual connections and the attention mechanism to the overall performance of the model. The source code for scG-cluster is publicly available at https://github.com/xixi-wq/scG-cluster.
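The residual concatenation mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the repository linked above); the function names, the symmetric normalisation, and the ReLU activation are assumptions chosen for illustration, and the attention weighting and the dual-topology graph are left out. The point it shows is that concatenating a layer's input to its output keeps the un-smoothed features available to later layers.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops, then apply D^-1/2 (A + I) D^-1/2 (assumed normalisation)."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a]
    return [[a[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def matmul(p, q):
    """Plain nested-list matrix product."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def gcn_layer_with_residual(a_norm, h, w):
    """Compute ReLU(A_norm @ H @ W) and concatenate the layer input H to it."""
    propagated = matmul(matmul(a_norm, h), w)
    activated = [[max(v, 0.0) for v in row] for row in propagated]
    # Residual concatenation: each cell keeps its original features
    # next to the neighbourhood-averaged ones.
    return [act + list(inp) for act, inp in zip(activated, h)]


# Hypothetical toy input: three cells on a path graph, two features each.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [[1.0], [-1.0]]  # hypothetical 2 -> 1 weight matrix
out = gcn_layer_with_residual(normalize_adjacency(adj), h, w)
```

Each row of `out` holds one propagated feature followed by the two original features of that cell, so the output width is the layer width plus the input width.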
format Article
id doaj-art-ebe82c178b5746fa8fe7d6217c80c6c7
institution Kabale University
issn 2045-2322
language English
publishDate 2025-02-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-ebe82c178b5746fa8fe7d6217c80c6c7; 2025-02-09T12:35:34Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-02-01; 15; 1; 1; 20; 10.1038/s41598-025-87672-7; Deep learning powered single-cell clustering framework with enhanced accuracy and stability; Yi Zhang (Guilin University of Technology); Xi Feng (Guilin University of Technology); Yin Wang (Guilin University of Technology); Kai Shi (Guilin University of Technology); https://doi.org/10.1038/s41598-025-87672-7; Deep structural clustering; Unsupervised clustering; scRNA-seq; TAGCN; Attention mechanism; Cellular heterogeneity
title Deep learning powered single-cell clustering framework with enhanced accuracy and stability
topic Deep structural clustering
Unsupervised clustering
scRNA-seq
TAGCN
Attention mechanism
Cellular heterogeneity
url https://doi.org/10.1038/s41598-025-87672-7