Complementary insights into gut viral genomes: a comparative benchmark of short- and long-read metagenomes using diverse assemblers and binners

Bibliographic Details
Main Authors: Huarui Wang, Chuqing Sun, Yun Li, Jingchao Chen, Xing-Ming Zhao, Wei-Hua Chen
Format: Article
Language: English
Published: BMC 2024-12-01
Series: Microbiome
Online Access: https://doi.org/10.1186/s40168-024-01981-z
author Huarui Wang
Chuqing Sun
Yun Li
Jingchao Chen
Xing-Ming Zhao
Wei-Hua Chen
collection DOAJ
description Abstract
Background: Metagenome-assembled viral genomes have significantly advanced the discovery and characterization of the human gut virome. However, we lack a comparative assessment of how effectively assembly tools identify viral genomes, particularly across next-generation sequencing (NGS) and third-generation sequencing (TGS) data.
Results: We evaluated the efficiency of NGS, TGS, and hybrid assemblers for viral genome discovery using 95 viral-like particle (VLP)-enriched fecal samples sequenced on both Illumina and PacBio platforms. MEGAHIT, metaFlye, and hybridSPAdes emerged as the optimal choices for NGS, TGS, and hybrid datasets, respectively. Notably, these assemblers recovered distinct viral genomes, demonstrating a remarkable degree of complementarity. Combining the results of individual assemblers expanded the total number of nonredundant high-quality viral genomes by 4.83- to 21.7-fold compared to any individual assembler. Among them, viral genomes from NGS and TGS data had the least overlap, indicating the impact of data type on viral genome recovery. We also evaluated four binning methods, finding that CONCOCT incorporated more unrelated contigs into the same bins, while MetaBAT2, AVAMB, and vRhyme balanced inclusiveness and taxonomic consistency within bins.
Conclusions: Our findings highlight the challenges in metagenome-driven viral discovery, underscoring tool limitations. We advocate for the combined use of multiple assemblers and sequencing technologies when feasible and highlight the urgent need for specialized tools tailored to gut virome assembly. This study contributes essential insights for advancing viral genome research in the context of gut metagenomics.
format Article
id doaj-art-43149ed026b74696a9e404ea43da1c7e
institution OA Journals
issn 2049-2618
language English
publishDate 2024-12-01
publisher BMC
record_format Article
series Microbiome
spelling doaj-art-43149ed026b74696a9e404ea43da1c7e
2025-08-20T02:31:44Z
eng
BMC
Microbiome
2049-2618
2024-12-01
12(1):1-14
10.1186/s40168-024-01981-z
Complementary insights into gut viral genomes: a comparative benchmark of short- and long-read metagenomes using diverse assemblers and binners
Huarui Wang (0)
Chuqing Sun (1)
Yun Li (2)
Jingchao Chen (3)
Xing-Ming Zhao (4)
Wei-Hua Chen (5)
(0, 1, 2, 3, 5) Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Department of Bioinformatics and Systems Biology, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology
(4) Department of Neurology, Institute of Science and Technology for Brain-Inspired Intelligence, Zhongshan Hospitaland, Fudan University
https://doi.org/10.1186/s40168-024-01981-z
title Complementary insights into gut viral genomes: a comparative benchmark of short- and long-read metagenomes using diverse assemblers and binners
url https://doi.org/10.1186/s40168-024-01981-z