Reference-informed evaluation of batch correction for single-cell omics data with overcorrection awareness
Abstract: Batch effect correction (BEC) is fundamental to integrating multiple single-cell RNA sequencing datasets, and its success is critical to empowering in-depth interrogation for biological insights. However, no simple metric is available to evaluate BEC performance with sensitivity to data overcorr...
| Main Authors: | Xiaoyue Hu, He Li, Ming Chen, Junbin Qian, Hangjin Jiang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-03-01 |
| Series: | Communications Biology |
| Online Access: | https://doi.org/10.1038/s42003-025-07947-7 |
| _version_ | 1849392318714478592 |
|---|---|
| author | Xiaoyue Hu, He Li, Ming Chen, Junbin Qian, Hangjin Jiang |
| author_sort | Xiaoyue Hu |
| collection | DOAJ |
| description | Abstract: Batch effect correction (BEC) is fundamental to integrating multiple single-cell RNA sequencing datasets, and its success is critical to empowering in-depth interrogation for biological insights. However, no simple metric is available to evaluate BEC performance with sensitivity to data overcorrection, which erases true biological variation and leads to false biological discoveries. Here, we propose RBET, a reference-informed statistical framework for evaluating the success of BEC. Using extensive simulations and six real data examples, including scRNA-seq and scATAC-seq datasets with different numbers of batches, batch effect sizes and numbers of cell types, we demonstrate that RBET evaluates the performance of BEC methods more fairly and yields biologically meaningful insights from data, while other methods may lead to false results. Moreover, RBET is computationally efficient, sensitive to overcorrection and robust to large batch effect sizes. Thus, RBET provides a robust guideline for selecting a case-specific BEC method, and the concept of RBET is extendable to other modalities. |
| format | Article |
| id | doaj-art-9a7e20682e834674b3ad30d6f26eef65 |
| institution | Kabale University |
| issn | 2399-3642 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Communications Biology |
| spelling | doaj-art-9a7e20682e834674b3ad30d6f26eef65; 2025-08-20T03:40:47Z; eng; Nature Portfolio; Communications Biology; 2399-3642; 2025-03-01; 81113; 10.1038/s42003-025-07947-7; Reference-informed evaluation of batch correction for single-cell omics data with overcorrection awareness; Xiaoyue Hu (Center for Data Science, Zhejiang University); He Li (Center for Data Science, Zhejiang University); Ming Chen (College of Life Sciences, Zhejiang University); Junbin Qian (Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women’s Hospital, Zhejiang University School of Medicine); Hangjin Jiang (Center for Data Science, Zhejiang University); https://doi.org/10.1038/s42003-025-07947-7 |
| title | Reference-informed evaluation of batch correction for single-cell omics data with overcorrection awareness |
| url | https://doi.org/10.1038/s42003-025-07947-7 |