Group-wise normalization in differential abundance analysis of microbiome samples

Abstract Background A key challenge in differential abundance analysis (DAA) of microbial sequencing data is that the counts for each sample are compositional, resulting in potentially biased comparisons of the absolute abundance across study groups. Normalization-based DAA methods rely on external...

Bibliographic Details
Main Authors: Dylan Clark-Boucher, Brent A. Coull, Harrison T. Reeder, Fenglei Wang, Qi Sun, Jacqueline R. Starr, Kyu Ha Lee
Format: Article
Language: English
Published: BMC 2025-07-01
Series: BMC Bioinformatics
Subjects: Microbiome; Normalization; Compositional data; Differential abundance analysis
Online Access: https://doi.org/10.1186/s12859-025-06235-9
author Dylan Clark-Boucher
Brent A. Coull
Harrison T. Reeder
Fenglei Wang
Qi Sun
Jacqueline R. Starr
Kyu Ha Lee
collection DOAJ
description Abstract
Background: A key challenge in differential abundance analysis (DAA) of microbial sequencing data is that the counts for each sample are compositional, resulting in potentially biased comparisons of the absolute abundance across study groups. Normalization-based DAA methods rely on external normalization factors that account for compositionality by standardizing the counts onto a common numerical scale. However, existing normalization methods have struggled to maintain the false discovery rate in settings where the variance or compositional bias is large. This article proposes a novel framework for normalization that can reduce bias in DAA by re-conceptualizing normalization as a group-level task. We present two new normalization methods within the group-wise framework: group-wise relative log expression (G-RLE) and fold-truncated sum scaling (FTSS).
Results: G-RLE and FTSS achieve higher statistical power for identifying differentially abundant taxa than existing methods in model-based and synthetic data simulation settings. The two novel methods also maintain the false discovery rate in challenging scenarios where existing methods suffer. The best results are obtained from using FTSS normalization with the DAA method MetagenomeSeq.
Conclusion: Compared with other methods for normalizing compositional sequence count data prior to DAA, the proposed group-level normalization frameworks offer more robust statistical inference. With a solid mathematical foundation, validated performance in numerical studies, and publicly available software, these new methods can help improve rigor and reproducibility in microbiome research.
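As a point of reference for the "existing normalization methods" mentioned in the abstract, the sketch below computes two widely used sample-level normalization factors: total-sum scaling and the classical relative log expression (median-of-ratios) factor. It is not the G-RLE or FTSS procedure from the article, whose details this record does not give. The function names and the taxa-by-samples array layout are assumptions made for illustration.

```python
import numpy as np


def tss_factors(counts):
    """Total-sum scaling: one factor per sample, equal to its library size.

    counts: array of shape (taxa, samples). The layout is an assumption.
    """
    counts = np.asarray(counts, dtype=float)
    return counts.sum(axis=0)


def rle_factors(counts):
    """Classical sample-wise RLE (median-of-ratios) factors.

    This is the standard sample-level version, not the group-wise
    variant proposed in the article.
    """
    counts = np.asarray(counts, dtype=float)
    # Keep taxa observed in every sample so the geometric mean is defined.
    keep = (counts > 0).all(axis=1)
    if not keep.any():
        raise ValueError("no taxon is non-zero in every sample")
    log_counts = np.log(counts[keep])
    # Log geometric mean of each retained taxon across samples.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample median of the log ratios, mapped back to the count scale.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


def normalize(counts, factors):
    """Divide each sample (column) by its normalization factor."""
    return np.asarray(counts, dtype=float) / np.asarray(factors, dtype=float)
```

If one sample is an exact multiple of another, both kinds of factor differ by that same multiple, so the normalized columns coincide. The factors are computed one sample at a time, which is the design the abstract contrasts with its group-level framework.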
format Article
id doaj-art-ec81b612eb7248f2b0152ed4ec5c3389
institution DOAJ
issn 1471-2105
language English
publishDate 2025-07-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-ec81b612eb7248f2b0152ed4ec5c3389
2025-08-20T03:06:36Z
eng
BMC
BMC Bioinformatics
1471-2105
2025-07-01
Volume 26, issue 1, pages 1-17
10.1186/s12859-025-06235-9
Group-wise normalization in differential abundance analysis of microbiome samples
Dylan Clark-Boucher (Department of Biostatistics, Harvard TH Chan School of Public Health)
Brent A. Coull (Department of Biostatistics, Harvard TH Chan School of Public Health)
Harrison T. Reeder (Biostatistics, Massachusetts General Hospital)
Fenglei Wang (Department of Nutrition, Harvard TH Chan School of Public Health)
Qi Sun (Department of Nutrition, Harvard TH Chan School of Public Health)
Jacqueline R. Starr (Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School)
Kyu Ha Lee (Department of Biostatistics, Harvard TH Chan School of Public Health)
title Group-wise normalization in differential abundance analysis of microbiome samples
topic Microbiome
Normalization
Compositional data
Differential abundance analysis
url https://doi.org/10.1186/s12859-025-06235-9