scICE: enhancing clustering reliability and efficiency of scRNA-seq data with multi-cluster label consistency evaluation
Abstract Clustering analysis is a fundamental step in scRNA-seq data analysis. However, its reliability is compromised by clustering inconsistency among trials due to stochastic processes in clustering algorithms. Despite efforts to obtain reliable and consensus clustering, existing methods cannot b...
| Main Authors: | Hyun Kim, Issac Park, Jong-Eun Park, Jong Kyoung Kim, Minseok Seo, Jae Kyoung Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-60702-8 |
| _version_ | 1849768749049053184 |
|---|---|
| author | Hyun Kim Issac Park Jong-Eun Park Jong Kyoung Kim Minseok Seo Jae Kyoung Kim |
| author_sort | Hyun Kim |
| collection | DOAJ |
| description | Abstract Clustering analysis is a fundamental step in scRNA-seq data analysis. However, its reliability is compromised by clustering inconsistency among trials due to stochastic processes in clustering algorithms. Despite efforts to obtain reliable and consensus clustering, existing methods cannot be applied to large scRNA-seq datasets due to high computational costs. Here, we develop the single-cell Inconsistency Clustering Estimator (scICE) to evaluate clustering consistency and provide consistent clustering results, achieving up to a 30-fold improvement in speed compared to conventional consensus clustering-based methods, such as multiK and chooseR. Application of scICE to 48 real and simulated scRNA-seq datasets, some with over 10,000 cells, successfully identifies all consistent clustering results, substantially narrowing the number of clusters to explore. By enabling the focus on a narrower set of more reliable candidate clusters, users can greatly reduce computational burden while generating more robust results. |
| format | Article |
| id | doaj-art-9eca166306434d4d85ab330fc9bf18d7 |
| institution | DOAJ |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-9eca166306434d4d85ab330fc9bf18d7; 2025-08-20T03:03:41Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-07-01; 16; 1; 1; 12; 10.1038/s41467-025-60702-8; scICE: enhancing clustering reliability and efficiency of scRNA-seq data with multi-cluster label consistency evaluation; Hyun Kim (0), Issac Park (1), Jong-Eun Park (2), Jong Kyoung Kim (3), Minseok Seo (4), Jae Kyoung Kim (5); (0) Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science; (1) Department of Mathematics, Pusan National University; (2) Graduate School of Medical Science and Engineering, KAIST; (3) Department of Life Sciences, Pohang University of Science and Technology (POSTECH); (4) Department of Computer and Information Science, Korea University; (5) Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science; https://doi.org/10.1038/s41467-025-60702-8 |
| title | scICE: enhancing clustering reliability and efficiency of scRNA-seq data with multi-cluster label consistency evaluation |
| title_sort | scice enhancing clustering reliability and efficiency of scrna seq data with multi cluster label consistency evaluation |
| url | https://doi.org/10.1038/s41467-025-60702-8 |