Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets
Cell deconvolution is a widely used method to characterize the composition of the mixed cell population in bulk transcriptomic datasets. While tissue- and blood-derived cell reference matrices (CRMs) are commonly used, their impact on deconvolution accuracy has yet to be systematically evaluated. In...
| Main Authors: | Siqi Sun, Shweta Yadav, Mulini Pingili, Dan Chang, Jing Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-01-01 |
| Series: | Computational and Structural Biotechnology Journal |
| Subjects: | Cell deconvolution; Cell reference matrices; Inflammatory bowel disease; Lung adenocarcinoma |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2001037025003216 |
| _version_ | 1849406299336343552 |
|---|---|
| author | Siqi Sun; Shweta Yadav; Mulini Pingili; Dan Chang; Jing Wang |
| collection | DOAJ |
| description | Cell deconvolution is a widely used method for characterizing the composition of mixed cell populations in bulk transcriptomic datasets. Although tissue- and blood-derived cell reference matrices (CRMs) are both commonly used, their impact on deconvolution accuracy has not been systematically evaluated. In this study, we developed tissue- and blood-derived CRMs using single-cell RNA sequencing (scRNA-seq) data from inflammatory bowel disease (IBD). Three publicly available blood-derived CRMs (IRIS, LM22, and ImmunoStates) were included for benchmarking. Deconvolution performance was evaluated on both public bulk transcriptomic datasets and simulated pseudobulk samples, using goodness-of-fit and the correlation of cell fractions. Two infliximab-treated bulk datasets were used to identify treatment-related cell types. In addition, lung adenocarcinoma (LUAD) single-cell and bulk transcriptomic datasets were used for deconvolution evaluation. We found that tissue-derived CRMs consistently outperformed blood-derived CRMs in deconvolving bulk tissue transcriptomes, exhibiting higher goodness-of-fit and more accurate cellular proportion estimates, particularly for immune and stromal cells. They also revealed more treatment-related cell types. In contrast, all CRMs performed similarly when applied to bulk blood transcriptomes. These trends were also observed in the LUAD datasets. Our results emphasize the importance of selecting appropriate CRMs for cell deconvolution of bulk tissue transcriptomes, particularly in immunology and oncology. Such considerations can be extended to other disease contexts. The R package (DeconvRef) for building user-defined CRMs is available at https://github.com/alohasiqi/DeconvRef |
| format | Article |
| id | doaj-art-7b50556c7c9a4c7aa21b032dca0b4b1a |
| institution | Kabale University |
| issn | 2001-0370 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Computational and Structural Biotechnology Journal |
| spelling | doaj-art-7b50556c7c9a4c7aa21b032dca0b4b1a; 2025-08-20T03:36:26Z; eng; Elsevier; Computational and Structural Biotechnology Journal; ISSN 2001-0370; 2025-01-01; vol. 27, pp. 3579-3588; doi:10.1016/j.csbj.2025.07.058; Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets; Siqi Sun, Shweta Yadav, Mulini Pingili, Dan Chang, Jing Wang; affiliation (all authors) and correspondence address: Genomics Research Center, AbbVie, 200 Sidney Street, Cambridge, MA 02139, United States; http://www.sciencedirect.com/science/article/pii/S2001037025003216; Cell deconvolution; Cell reference matrices; Inflammatory bowel disease; Lung adenocarcinoma |
| title | Estimating the effect of tissue- and blood-derived cell reference matrices on deconvolving bulk transcriptomic datasets |
| topic | Cell deconvolution; Cell reference matrices; Inflammatory bowel disease; Lung adenocarcinoma |
| url | http://www.sciencedirect.com/science/article/pii/S2001037025003216 |
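
The description evaluates deconvolution on simulated pseudobulk samples by goodness-of-fit and by correlation of cell fractions. The following is a minimal sketch of how those two metrics could be computed, not the study's pipeline or the DeconvRef API. It assumes a projected-gradient non-negative least squares solver as a stand-in (the record does not say which deconvolution algorithm was used), and every name, size, and noise level in it is hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mix(reference, fractions):
    """Expected bulk profile: per-gene weighted sum of cell-type profiles."""
    return [sum(s * f for s, f in zip(row, fractions)) for row in reference]


def deconvolve(reference, bulk, iters=2000):
    """Non-negative least squares via projected gradient descent.

    Stand-in solver (an assumption, chosen for brevity). Returns
    cell fractions rescaled to sum to 1.
    """
    k = len(reference[0])
    # Step size from a crude Lipschitz bound: squared Frobenius norm.
    step = 1.0 / sum(s * s for row in reference for s in row)
    w = [1.0 / k] * k
    for _ in range(iters):
        resid = [p - b for p, b in zip(mix(reference, w), bulk)]
        grad = [sum(row[j] * r for row, r in zip(reference, resid))
                for j in range(k)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wj - step * gj) for wj, gj in zip(w, grad)]
    total = sum(w) or 1.0
    return [wj / total for wj in w]


def evaluate(reference, bulk, truth):
    """Return estimated fractions, goodness-of-fit, and fraction correlation."""
    est = deconvolve(reference, bulk)
    fit = pearson(mix(reference, est), bulk)  # reconstructed vs observed bulk
    frac_r = pearson(est, truth)              # estimated vs true fractions
    return est, fit, frac_r


rng = random.Random(0)
n_genes, n_types = 40, 4  # hypothetical sizes

# Hypothetical "matched" reference: the profiles that generated the sample.
matched_ref = [[rng.uniform(0.0, 10.0) for _ in range(n_types)]
               for _ in range(n_genes)]
# Hypothetical "mismatched" reference: the same profiles, randomly distorted.
mismatched_ref = [[s * rng.uniform(0.5, 1.5) for s in row]
                  for row in matched_ref]

true_fractions = [0.50, 0.30, 0.15, 0.05]
# Simulated pseudobulk: known mixture plus a little measurement noise.
pseudobulk = [v + rng.gauss(0.0, 0.05)
              for v in mix(matched_ref, true_fractions)]

est_matched, fit_matched, r_matched = evaluate(
    matched_ref, pseudobulk, true_fractions)
est_mismatched, fit_mismatched, r_mismatched = evaluate(
    mismatched_ref, pseudobulk, true_fractions)

print("matched:    fit=%.4f  fraction r=%.4f" % (fit_matched, r_matched))
print("mismatched: fit=%.4f  fraction r=%.4f" % (fit_mismatched, r_mismatched))
```

Because the true fractions of a simulated pseudobulk sample are known, both metrics can be computed for any candidate reference matrix. With real bulk data only the goodness-of-fit is available unless matched cell counts exist.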