TphPMF: A microbiome data imputation method using hierarchical Bayesian Probabilistic Matrix Factorization.

In microbiome research, data sparsity represents a prevalent and formidable challenge. Sparse data not only compromises the accuracy of statistical analyses but also conceals critical biological relationships, thereby undermining the reliability of the conclusions. To tackle this issue, we introduce...

Bibliographic Details
Main Authors: Xinyu Han, Kai Song
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-03-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012858
author Xinyu Han
Kai Song
collection DOAJ
description In microbiome research, data sparsity represents a prevalent and formidable challenge. Sparse data not only compromises the accuracy of statistical analyses but also conceals critical biological relationships, thereby undermining the reliability of the conclusions. To tackle this issue, we introduce a machine learning approach for microbiome data imputation, termed TphPMF. This technique leverages Probabilistic Matrix Factorization, incorporating phylogenetic relationships among microorganisms to establish Bayesian prior distributions. These priors facilitate posterior predictions of potential non-biological zeros. We demonstrate that TphPMF outperforms existing microbiome data imputation methods in accurately recovering missing taxon abundances. Furthermore, TphPMF enhances the efficacy of certain differential abundance analysis methods in detecting differentially abundant (DA) taxa, particularly showing advantages when used in conjunction with DESeq2-phyloseq. Additionally, TphPMF significantly improves the precision of cross-predicting disease conditions in microbiome datasets pertaining to type 2 diabetes and colorectal cancer.
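Note: the description above names Probabilistic Matrix Factorization (PMF) with Bayesian priors as the basis of TphPMF. As a rough illustration only, the following is a minimal sketch of plain PMF imputation, not the authors' implementation. It assumes zero-mean Gaussian priors on the factors in place of the phylogeny-derived hierarchical priors the record describes, and it marks entries to be imputed with `None`. The function name `pmf_impute`, its hyperparameters, and the toy matrix are all hypothetical.

```python
import random


def pmf_impute(matrix, rank=2, lam=0.05, lr=0.02, epochs=500, seed=0):
    """Fill the None entries of `matrix` with a plain PMF estimate.

    Hypothetical helper for illustration. With a Gaussian likelihood and
    zero-mean Gaussian priors on the factors, the MAP estimate minimizes
    an L2-regularized squared error over the observed entries; here that
    objective is minimized by stochastic gradient steps.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Small random initial factors: one vector per row (sample) and one
    # per column (taxon).
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_cols)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Only observed entries contribute to the likelihood term.
    observed = [
        (i, j, matrix[i][j])
        for i in range(n_rows)
        for j in range(n_cols)
        if matrix[i][j] is not None
    ]

    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j, x in observed:
            err = x - dot(U[i], V[j])
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step: data-fit term minus the prior (L2) term.
                U[i][k] += lr * (err * v - lam * u)
                V[j][k] += lr * (err * u - lam * v)

    # Keep observed values as they are; predict only the missing ones.
    return [
        [x if x is not None else dot(U[i], V[j]) for j, x in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Toy matrix (rows: samples, columns: taxa); None marks entries to impute.
toy = [
    [1.0, 2.0, None],
    [2.0, 4.0, 6.0],
    [3.0, None, 9.0],
]
filled = pmf_impute(toy)
for row in filled:
    print([round(value, 2) for value in row])
```

In this sketch every factor shares the same zero-centered prior. A phylogeny-aware variant, as the description suggests, would presumably tie the priors of related taxa together, but how TphPMF does so is not stated in this record.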
format Article
id doaj-art-a50eb5639f1b42e08791c8d65ca09d5e
institution DOAJ
issn 1553-734X
1553-7358
language English
publishDate 2025-03-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj-art-a50eb5639f1b42e08791c8d65ca09d5e; 2025-08-20T03:08:20Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2025-03-01; 21; 3; e1012858; 10.1371/journal.pcbi.1012858; TphPMF: A microbiome data imputation method using hierarchical Bayesian Probabilistic Matrix Factorization.; Xinyu Han; Kai Song; https://doi.org/10.1371/journal.pcbi.1012858
title TphPMF: A microbiome data imputation method using hierarchical Bayesian Probabilistic Matrix Factorization.
url https://doi.org/10.1371/journal.pcbi.1012858