A species-level identification pipeline for human gut microbiota based on the V3-V4 regions of 16S rRNA

16S rRNA gene sequencing is pivotal for identifying bacterial species in microbiome studies, especially using the V3-V4 hypervariable regions. A fixed 98.5% similarity threshold is often applied for species-level identification, but this approach can cause misclassification due to varying thresholds...

Bibliographic Details
Main Authors: Min Wang, Tingting Yuan, Jiali Chen, Jing Yang, Ji Pu, Wenchao Lin, Kui Dong, Luqing Zhang, Jiale Yuan, Han Zheng, Yamin Sun, Jianguo Xu
Format: Article
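Language: English
Published: Frontiers Media S.A. 2025-03-01
Series: Frontiers in Microbiology
Subjects: 16S rRNA; microbiota; species-level identification; taxonomic thresholds; database abbreviations
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2025.1553124/full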
collection DOAJ
description 16S rRNA gene sequencing is pivotal for identifying bacterial species in microbiome studies, especially when targeting the V3-V4 hypervariable regions. A fixed 98.5% similarity threshold is often applied for species-level identification, but this approach can cause misclassification because the appropriate threshold varies among species. To address this, our study integrated data from the SILVA, NCBI, and LPSN databases, extracting V3-V4 region sequences and supplementing them with 16S rRNA sequences from 1,082 human gut samples. This resulted in a non-redundant amplicon sequence variant (ASV) database specific to the V3-V4 regions (positions 341–806). Using this database, we identified flexible classification thresholds for 674 families, 3,661 genera, and 15,735 species, finding clear thresholds for 87.09% of families and 98.38% of genera. For the 896 most common human gut species, we established precise taxonomic thresholds. To leverage these findings, we developed the asvtax pipeline, which applies flexible thresholds for more accurate taxonomic classification, notably improving the identification of new ASVs. The asvtax pipeline not only enhances the precision of species-level classification but also provides a robust framework for analyzing complex microbial communities, facilitating more reliable ecological and functional interpretations in microbiome research.
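The contrast the abstract draws between a fixed 98.5% cutoff and flexible, taxon-specific thresholds can be illustrated with a minimal Python sketch. This is not the asvtax implementation: the `Reference` class, the `percent_identity` and `classify` functions, the toy sequences, and the 94.0 threshold are all hypothetical, and the sketch assumes the query and reference amplicons are already aligned to equal length.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Fixed cutoff (percent) that the abstract describes as commonly applied.
FIXED_THRESHOLD = 98.5


@dataclass(frozen=True)
class Reference:
    """Hypothetical reference entry: a species label and its V3-V4 amplicon."""
    species: str
    sequence: str  # assumed pre-aligned to the query, same length


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def classify(query: str,
             references: List[Reference],
             thresholds: Dict[str, float]) -> Optional[str]:
    """Return the best-matching species if it clears that species' threshold.

    Species without an entry in `thresholds` fall back to FIXED_THRESHOLD.
    Returns None when there are no references or the best hit is below cutoff.
    """
    best: Optional[Reference] = None
    best_identity = -1.0
    for ref in references:
        identity = percent_identity(query, ref.sequence)
        if identity > best_identity:
            best, best_identity = ref, identity
    if best is None:
        return None
    cutoff = thresholds.get(best.species, FIXED_THRESHOLD)
    return best.species if best_identity >= cutoff else None


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper.
    refs = [Reference("Species A", "ACGTACGTACGTACGTACGT")]
    query = "ACGTACGTACGTACGTACGA"  # one mismatch in 20 positions
    print(classify(query, refs, {"Species A": 94.0}))  # flexible threshold
    print(classify(query, refs, {}))                   # fixed 98.5% fallback
```

In this toy case the same query is assigned to a species under the species-specific threshold but left unclassified under the fixed cutoff, which is the kind of disagreement the abstract attributes to thresholds varying among species.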
id doaj-art-de88e3e367e047f8ab564eaeeb60d415
institution DOAJ
issn 1664-302X
spelling doaj-art-de88e3e367e047f8ab564eaeeb60d415 | 2025-08-20T02:48:46Z | eng | Frontiers Media S.A. | Frontiers in Microbiology | 1664-302X | 2025-03-01 | 16 | 10.3389/fmicb.2025.1553124 | 1553124
Author affiliations (indexed as in the record):
Min Wang [0]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China
Tingting Yuan [1]: School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China
Tingting Yuan [2]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Jiali Chen [3]: School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China
Jiali Chen [4]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Jing Yang [5]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Ji Pu [6]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Wenchao Lin [7]: Uniteomics Tianjin Biotechnology Co., Ltd., Tianjin, China
Wenchao Lin [8]: Beijing Institute of Infectious Diseases, Beijing, China
Kui Dong [9]: Department of Epidemiology, School of Public Health, Shanxi Medical University, Taiyuan, China
Luqing Zhang [10]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China
Jiale Yuan [11]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China
Han Zheng [12]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Yamin Sun [13]: Beijing Institute of Infectious Diseases, Beijing, China
Yamin Sun [14]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing, China
Jianguo Xu [15]: School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China
Jianguo Xu [16]: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
topic 16S rRNA
microbiota
species-level identification
taxonomic thresholds
database abbreviations
url https://www.frontiersin.org/articles/10.3389/fmicb.2025.1553124/full