A species-level identification pipeline for human gut microbiota based on the V3-V4 regions of 16S rRNA
16S rRNA gene sequencing is pivotal for identifying bacterial species in microbiome studies, especially using the V3-V4 hypervariable regions. A fixed 98.5% similarity threshold is often applied for species-level identification, but this approach can cause misclassification due to varying thresholds...
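
To make the contrast between a single fixed cutoff and per-taxon cutoffs concrete, here is a minimal Python sketch. It is not the asvtax implementation: the function name `classify`, the placeholder species names, and the per-species threshold values are all hypothetical and chosen only for illustration. Only the 98.5% default comes from the abstract.

```python
from typing import Optional

# Fixed species-level cutoff mentioned in the abstract (percent identity).
FIXED_THRESHOLD = 98.5

# Hypothetical per-species cutoffs; names and values are made up for illustration.
FLEXIBLE_THRESHOLDS = {
    "Species A": 99.5,  # assumed to have close relatives, so a stricter cutoff
    "Species B": 97.0,  # assumed to vary more within the species, so a looser cutoff
}


def classify(best_hit: str, identity: float, flexible: bool = True) -> Optional[str]:
    """Return the best-hit species if the identity passes its cutoff, else None."""
    if flexible:
        # Fall back to the fixed cutoff when no per-species value is known.
        cutoff = FLEXIBLE_THRESHOLDS.get(best_hit, FIXED_THRESHOLD)
    else:
        cutoff = FIXED_THRESHOLD
    return best_hit if identity >= cutoff else None
```

With these made-up values, a 99.0% hit to "Species A" would be accepted under the fixed cutoff but rejected under the flexible one, which is the kind of misclassification the abstract says a fixed threshold can cause.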
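| Main Authors: | Min Wang, Tingting Yuan, Jiali Chen, Jing Yang, Ji Pu, Wenchao Lin, Kui Dong, Luqing Zhang, Jiale Yuan, Han Zheng, Yamin Sun, Jianguo Xu |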
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-03-01 |
| Series: | Frontiers in Microbiology |
| Subjects: | 16S rRNA; microbiota; species-level identification; taxonomic thresholds; database abbreviations |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fmicb.2025.1553124/full |
| _version_ | 1850066309747834880 |
|---|---|
| author | Min Wang; Tingting Yuan; Jiali Chen; Jing Yang; Ji Pu; Wenchao Lin; Kui Dong; Luqing Zhang; Jiale Yuan; Han Zheng; Yamin Sun; Jianguo Xu |
| author_facet | Min Wang; Tingting Yuan; Jiali Chen; Jing Yang; Ji Pu; Wenchao Lin; Kui Dong; Luqing Zhang; Jiale Yuan; Han Zheng; Yamin Sun; Jianguo Xu |
| author_sort | Min Wang |
| collection | DOAJ |
| description | 16S rRNA gene sequencing is pivotal for identifying bacterial species in microbiome studies, especially using the V3-V4 hypervariable regions. A fixed 98.5% similarity threshold is often applied for species-level identification, but this approach can cause misclassification due to varying thresholds among species. To address this, our study integrated data from SILVA, NCBI, and LPSN databases, extracting V3-V4 region sequences and supplementing them with 16S rRNA sequences from 1,082 human gut samples. This resulted in a non-redundant amplicon sequence variants (ASVs) database specific to the V3-V4 regions (positions 341–806). Utilizing this database, we identified flexible classification thresholds for 674 families, 3,661 genera, and 15,735 species, finding clear thresholds for 87.09% of families and 98.38% of genera. For the 896 most common human gut species, we established precise taxonomic thresholds. To leverage these findings, we developed the asvtax pipeline, which applies flexible thresholds for more accurate taxonomic classification, notably improving the identification of new ASVs. The asvtax pipeline not only enhances the precision of species-level classification but also provides a robust framework for analyzing complex microbial communities, facilitating more reliable ecological and functional interpretations in microbiome research. |
| format | Article |
| id | doaj-art-de88e3e367e047f8ab564eaeeb60d415 |
| institution | DOAJ |
| issn | 1664-302X |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Microbiology |
| spelling | doaj-art-de88e3e367e047f8ab564eaeeb60d415; 2025-08-20T02:48:46Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; ISSN 1664-302X; 2025-03-01; vol. 16; DOI 10.3389/fmicb.2025.1553124; article 1553124; A species-level identification pipeline for human gut microbiota based on the V3-V4 regions of 16S rRNA. Authors with affiliation indices: Min Wang [0]; Tingting Yuan [1, 2]; Jiali Chen [3, 4]; Jing Yang [5]; Ji Pu [6]; Wenchao Lin [7, 8]; Kui Dong [9]; Luqing Zhang [10]; Jiale Yuan [11]; Han Zheng [12]; Yamin Sun [13, 14]; Jianguo Xu [15, 16]. Affiliations in record order: [0] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China; [1] School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China; [2] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China; [3] School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China; [4] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China; [5] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China; [6] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China; [7] Uniteomics Tianjin Biotechnology Co., Ltd., Tianjin, China; [8] Beijing Institute of Infectious Diseases, Beijing, China; [9] Department of Epidemiology, School of Public Health, Shanxi Medical University, Taiyuan, China; [10] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China; [11] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China; [12] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China; [13] Beijing Institute of Infectious Diseases, Beijing, China; [14] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing, China; [15] School of Medicine, Research Institute of Public Health, Nankai University, Tianjin, China; [16] National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China |
| title | A species-level identification pipeline for human gut microbiota based on the V3-V4 regions of 16S rRNA |
| topic | 16S rRNA microbiota species-level identification taxonomic thresholds database abbreviations |
| url | https://www.frontiersin.org/articles/10.3389/fmicb.2025.1553124/full |
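
The description mentions a non-redundant ASV database restricted to positions 341–806. As a rough sketch of that idea only, the Python below slices a 1-based inclusive coordinate range out of each sequence and collapses exact duplicates. It assumes the input sequences are already in the same reference coordinate system (real full-length 16S sequences would first need alignment or primer matching), and `extract_region` and `build_nonredundant` are hypothetical names, not part of asvtax.

```python
from typing import Dict, Iterable, List, Tuple

# Region boundaries from the abstract: 1-based, inclusive.
V3V4_START, V3V4_END = 341, 806


def extract_region(seq: str, start: int = V3V4_START, end: int = V3V4_END) -> str:
    """Slice a 1-based inclusive range; return '' if the sequence is too short."""
    if len(seq) < end:
        return ""
    return seq[start - 1:end]


def build_nonredundant(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each unique region sequence to the IDs of the records that share it."""
    db: Dict[str, List[str]] = {}
    for rec_id, seq in records:
        region = extract_region(seq)
        if region:  # skip sequences that do not cover the region
            db.setdefault(region, []).append(rec_id)
    return db
```

Keying the dictionary on the region sequence itself is what makes the result non-redundant: identical V3-V4 regions from different source records end up as one entry with several IDs attached.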