Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets

Bibliographic Details
Main Authors: Siqi Sun, Shweta Yadav, Mulini Pingili, Dan Chang, Jing Wang
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Cell deconvolution; Cell reference matrices; Inflammatory bowel disease; Lung adenocarcinoma
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037025003216
author Siqi Sun
Shweta Yadav
Mulini Pingili
Dan Chang
Jing Wang
author_sort Siqi Sun
collection DOAJ
description Cell deconvolution is widely used to characterize the composition of mixed cell populations in bulk transcriptomic datasets. Although tissue- and blood-derived cell reference matrices (CRMs) are both commonly used, their impact on deconvolution accuracy has not been systematically evaluated. In this study, we developed tissue- and blood-derived CRMs using single-cell RNA sequencing (scRNA-seq) data from inflammatory bowel disease (IBD). Three publicly available blood-derived CRMs (IRIS, LM22, and ImmunoStates) were included for benchmarking. Deconvolution performance was evaluated on both public bulk transcriptomic datasets and simulated pseudobulk samples, using goodness-of-fit and the correlation of cell fractions. Two infliximab-treated bulk datasets were used to identify treatment-related cell types. Lung adenocarcinoma (LUAD) single-cell and bulk transcriptomic datasets were also used for evaluation. We found that tissue-derived CRMs consistently outperformed blood-derived CRMs in deconvolving bulk tissue transcriptomes, showing higher goodness-of-fit and more accurate cell-proportion estimates, particularly for immune and stromal cells. They also revealed more treatment-related cell types. In contrast, all CRMs performed similarly on bulk blood transcriptomes. The same trends were observed in the LUAD datasets. Our results emphasize the importance of selecting an appropriate CRM for cell deconvolution of bulk tissue transcriptomes, particularly in immunology and oncology. These considerations may also extend to other diseases. The R package (DeconvRef) for building user-defined CRMs is available at https://github.com/alohasiqi/DeconvRef
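The evaluation idea in the description (deconvolve a simulated pseudobulk sample, then score goodness-of-fit and the correlation of cell fractions) can be illustrated with a minimal sketch. The record does not state which deconvolution algorithm or goodness-of-fit metric the authors used, so this sketch assumes non-negative least squares solved by projected gradient descent and Pearson correlation. The matrices, fractions, and cell-type labels are invented toy values, and nothing here is the DeconvRef API.

```python
import math

# Toy reference matrix (invented values): rows are genes, columns are
# hypothetical cell types (epithelial, immune, stromal).
TISSUE_REF = [
    [10.0, 1.0, 1.0],
    [8.0, 2.0, 1.0],
    [1.0, 9.0, 2.0],
    [2.0, 10.0, 1.0],
    [1.0, 1.0, 12.0],
    [1.0, 2.0, 9.0],
]
TRUE_FRACTIONS = [0.5, 0.3, 0.2]  # assumed ground truth for the pseudobulk


def mix(ref, fractions):
    """Return the expression profile ref @ fractions (one value per gene)."""
    return [sum(r * f for r, f in zip(row, fractions)) for row in ref]


def nnls(ref, bulk, iters=2000):
    """Non-negative least squares via projected gradient descent.

    This solver is an assumption made for the sketch, not the paper's method.
    """
    n_types = len(ref[0])
    # The squared Frobenius norm bounds the largest eigenvalue of ref^T ref,
    # so its reciprocal is a safe, conservative step size.
    step = 1.0 / sum(v * v for row in ref for v in row)
    est = [1.0 / n_types] * n_types
    for _ in range(iters):
        resid = [p - b for p, b in zip(mix(ref, est), bulk)]
        grad = [sum(row[k] * r for row, r in zip(ref, resid)) for k in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        est = [max(0.0, e - step * g) for e, g in zip(est, grad)]
    return est


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Simulate a noise-free pseudobulk sample from the full reference.
pseudobulk = mix(TISSUE_REF, TRUE_FRACTIONS)

# Deconvolve with the full reference and with one that lacks the stromal column.
est_full = nnls(TISSUE_REF, pseudobulk)
REDUCED_REF = [row[:2] for row in TISSUE_REF]
est_reduced = nnls(REDUCED_REF, pseudobulk)

# Goodness-of-fit: correlation between observed and reconstructed expression.
gof_full = pearson(pseudobulk, mix(TISSUE_REF, est_full))
gof_reduced = pearson(pseudobulk, mix(REDUCED_REF, est_reduced))

# Cell-fraction correlation: estimated versus true fractions.
fraction_corr = pearson(est_full, TRUE_FRACTIONS)

print("estimated fractions:", [round(v, 3) for v in est_full])
print("goodness-of-fit (full, reduced):", round(gof_full, 3), round(gof_reduced, 3))
print("fraction correlation:", round(fraction_corr, 3))
```

In this toy setting the reference that omits a cell type present in the sample reconstructs the bulk profile less well. That only illustrates how the two metrics behave and does not reproduce the study's results.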
format Article
id doaj-art-7b50556c7c9a4c7aa21b032dca0b4b1a
institution Kabale University
issn 2001-0370
language English
publishDate 2025-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj-art-7b50556c7c9a4c7aa21b032dca0b4b1a
2025-08-20T03:36:26Z
eng
Elsevier
Computational and Structural Biotechnology Journal
ISSN 2001-0370
2025-01-01
Volume 27, pages 3579-3588
DOI 10.1016/j.csbj.2025.07.058
Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets
Siqi Sun (0), Shweta Yadav (1), Mulini Pingili (2), Dan Chang (3), Jing Wang (4)
(0) Correspondence to: Genomics Research Center, AbbVie, 200 Sidney Street, Cambridge, MA 02139, United States
(0-4) Genomics Research Center, AbbVie, 200 Sidney Street, Cambridge, MA 02139, United States
http://www.sciencedirect.com/science/article/pii/S2001037025003216
Cell deconvolution
Cell reference matrices
Inflammatory bowel disease
Lung adenocarcinoma
title Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets
title_sort estimating the effect of tissue and blood derived cell reference matrices on deconvolving bulk transcriptomic datasets
topic Cell deconvolution
Cell reference matrices
Inflammatory bowel disease
Lung adenocarcinoma
url http://www.sciencedirect.com/science/article/pii/S2001037025003216