G3DC: A Gene-Graph-Guided Selective Deep Clustering Method for Single Cell RNA-seq Data
Single-cell RNA sequencing (scRNA-seq) technology measures the expression of thousands of genes at the cellular level. Analyzing single-cell transcriptomes allows the identification of heterogeneous cell groups, cellular-level regulation, and the trajectory of cell development. An important aspect i...
Main Authors: | Shuqing He, Jicong Fan, Tianwei Yu |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2024-09-01 |
Series: | Big Data Mining and Analytics |
Subjects: | gene graphs, feature selection, deep learning |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2024.9020011 |
_version_ | 1832543052374212608 |
---|---|
author | Shuqing He; Jicong Fan; Tianwei Yu |
collection | DOAJ |
description | Single-cell RNA sequencing (scRNA-seq) technology measures the expression of thousands of genes at the cellular level. Analyzing single-cell transcriptomes allows the identification of heterogeneous cell groups, cellular-level regulation, and the trajectory of cell development. An important aspect of scRNA-seq data analysis is the clustering of cells, which is hampered by issues such as high dimensionality, cell type imbalance, redundancy, and dropout. Given that cells of each type are functionally consistent, incorporating biological relations among genes may improve the clustering results. In light of this, we have developed a deep-embedded clustering method, G3DC. This method combines a graph regularization based on a pre-existing gene network and a feature selector based on ℓ2,1-norm regularization, along with a reconstruction loss, to generate a discriminative and informative embedding. Utilizing the gene interaction network bolsters the clustering performance and aids in selecting functionally coherent genes, consequently enriching the clustering results. Extensive experiments show that G3DC achieves high clustering accuracy, measured by agreement with true cell types, and outperforms other leading single-cell clustering methods. In addition, G3DC selects biologically relevant genes that contribute to the clustering, providing insight into the biological functions that differentiate cell groups. |
format | Article |
id | doaj-art-ec860bb65c1541548b9e2b667e537c06 |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2024-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj-art-ec860bb65c1541548b9e2b667e537c06; 2025-02-03T11:53:25Z; eng; Tsinghua University Press; Big Data Mining and Analytics; ISSN 2096-0654; 2024-09-01; vol. 7, no. 3, pp. 809-827; DOI 10.26599/BDMA.2024.9020011; G3DC: A Gene-Graph-Guided Selective Deep Clustering Method for Single Cell RNA-seq Data; Shuqing He (Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA); Jicong Fan (School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen 518172, China, and also with Shenzhen Research Institute of Big Data, Shenzhen 518172, China); Tianwei Yu (School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen 518172, China, and also with Warshel Institute for Computational Biology, Shenzhen 518172, China); https://www.sciopen.com/article/10.26599/BDMA.2024.9020011; gene graphs; feature selection; deep learning |
title | G3DC: A Gene-Graph-Guided Selective Deep Clustering Method for Single Cell RNA-seq Data |
topic | gene graphs; feature selection; deep learning |
url | https://www.sciopen.com/article/10.26599/BDMA.2024.9020011 |
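
The description above names three ingredients of the G3DC objective: a reconstruction loss, a graph regularization built from a gene network, and an ℓ2,1-norm feature selector. The record does not give the exact formulation, so the following is only a minimal sketch of how such terms are commonly combined, not the authors' implementation. It assumes a linear encoder weight matrix `W` with one row per gene, a symmetric gene adjacency matrix `A`, and an unnormalized graph Laplacian; the function names, the linear decoder `W_dec`, and the weights `lam_graph` and `lam_l21` are all hypothetical.

```python
import numpy as np


def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W (one row per gene).

    Penalizing it pushes whole rows toward zero, which acts as gene selection.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))


def graph_laplacian(A):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix A."""
    return np.diag(A.sum(axis=1)) - A


def graph_penalty(W, L):
    """trace(W^T L W): small when genes linked in the graph have similar rows in W."""
    return float(np.trace(W.T @ L @ W))


def reconstruction_loss(X, X_hat):
    """Mean squared error between the input and its reconstruction."""
    return float(np.mean((X - X_hat) ** 2))


def total_loss(X, W, W_dec, A, lam_graph=0.1, lam_l21=0.1):
    """Hypothetical combined objective for a linear autoencoder (assumption).

    X: cells x genes, W: genes x hidden, W_dec: hidden x genes, A: genes x genes.
    """
    X_hat = (X @ W) @ W_dec
    return (
        reconstruction_loss(X, X_hat)
        + lam_graph * graph_penalty(W, graph_laplacian(A))
        + lam_l21 * l21_norm(W)
    )


if __name__ == "__main__":
    # Two genes connected by one edge; the second gene has an all-zero row.
    W = np.array([[3.0, 4.0], [0.0, 0.0]])
    A = np.array([[0.0, 1.0], [1.0, 0.0]])
    print(l21_norm(W))                           # row norms are 5 and 0
    print(graph_penalty(W, graph_laplacian(A)))  # squared distance between the two rows
```

In this toy case the ℓ2,1 term charges only the non-zero row, while the graph term charges the difference between the two connected genes, which illustrates why the two penalties pull toward sparse but network-coherent gene weights.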