Reference-informed evaluation of batch correction for single-cell omics data with overcorrection awareness

Bibliographic Details
Main Authors: Xiaoyue Hu, He Li, Ming Chen, Junbin Qian, Hangjin Jiang
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Communications Biology
Online Access: https://doi.org/10.1038/s42003-025-07947-7
Description
Summary: Batch effect correction (BEC) is fundamental to integrating multiple single-cell RNA sequencing datasets, and its success is critical for in-depth interrogation of the data for biological insights. However, no simple metric is available to evaluate BEC performance with sensitivity to data overcorrection, which erases true biological variation and leads to false biological discoveries. Here, we propose RBET, a reference-informed statistical framework for evaluating the success of BEC. Using extensive simulations and six real data examples, including scRNA-seq and scATAC-seq datasets with different numbers of batches, batch effect sizes and numbers of cell types, we demonstrate that RBET evaluates the performance of BEC methods more fairly and with biologically meaningful insights from the data, while other methods may lead to false results. Moreover, RBET is computationally efficient, sensitive to overcorrection and robust to large batch effect sizes. Thus, RBET provides a robust guideline for selecting a case-specific BEC method, and the concept of RBET is extendable to other modalities.
ISSN: 2399-3642