Testing the limits of short-reads metagenomic classifications programs in wastewater treating microbial communities
Abstract Biological wastewater treatment processes, such as activated sludge (AS) and aerobic granular sludge (AGS), have proven to be crucial systems for achieving both efficient waste purification and the recovery of valuable resources like poly-hydroxy-alkanoates. Gaining a deeper understanding o...
| Main Authors: | Leandro Di Gloria, Lorenzo Casbarra, Tommaso Lotti, Matteo Ramazzotti |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | Wastewater; Microbial community; Classifications; Aerobic granular sludge; Benchmark |
| Online Access: | https://doi.org/10.1038/s41598-025-07734-8 |
| _version_ | 1849226277820563456 |
|---|---|
| author | Leandro Di Gloria; Lorenzo Casbarra; Tommaso Lotti; Matteo Ramazzotti |
| author_sort | Leandro Di Gloria |
| collection | DOAJ |
| description | Abstract: Biological wastewater treatment processes, such as activated sludge (AS) and aerobic granular sludge (AGS), have proven to be crucial systems for achieving both efficient waste purification and the recovery of valuable resources such as polyhydroxyalkanoates. Gaining a deeper understanding of the microbial communities underpinning these technologies would enable their optimization, ultimately reducing costs and increasing efficiency. To support this research, we quantitatively compared classification methods differing in input sequence length (raw reads, contigs and MAGs), overall search approach (Kaiju, Kraken2, RiboFrame and kMetaShot), and source database, assessing classification performance at both the genus and species levels on an in silico-generated mock community designed to provide a simplified yet comprehensive representation of the complex microbial ecosystems found in AS and AGS. Particular attention was given to the misclassification of eukaryotes as bacteria and vice versa, as well as to the occurrence of false negatives. Notably, Kaiju emerged as the most accurate classifier at both the genus and species levels, followed by RiboFrame and kMetaShot. However, our findings highlight a substantial risk of misclassification across all classifiers and databases, which could significantly hinder the advancement of these technologies by introducing noise and errors for key microbial clades. |
| format | Article |
| id | doaj-art-6aa4153279e549e0b5482c2fabfe9537 |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-6aa4153279e549e0b5482c2fabfe9537; 2025-08-24T11:28:35Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2025-07-01; vol. 15, no. 1, pp. 1-13; 10.1038/s41598-025-07734-8; Testing the limits of short-reads metagenomic classifications programs in wastewater treating microbial communities; Leandro Di Gloria (Department of Experimental and Clinical Biomedical Sciences, University of Florence); Lorenzo Casbarra (Department of Experimental and Clinical Biomedical Sciences, University of Florence); Tommaso Lotti (Department of Civil and Environmental Engineering, University of Florence); Matteo Ramazzotti (Department of Experimental and Clinical Biomedical Sciences, University of Florence); https://doi.org/10.1038/s41598-025-07734-8; Wastewater; Microbial community; Classifications; Aerobic granular sludge; Benchmark |
| title | Testing the limits of short-reads metagenomic classifications programs in wastewater treating microbial communities |
| topic | Wastewater; Microbial community; Classifications; Aerobic granular sludge; Benchmark |
| url | https://doi.org/10.1038/s41598-025-07734-8 |