Testing the limits of short-reads metagenomic classifications programs in wastewater treating microbial communities

Bibliographic Details
Main Authors: Leandro Di Gloria, Lorenzo Casbarra, Tommaso Lotti, Matteo Ramazzotti
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Subjects: Wastewater; Microbial community; Classifications; Aerobic granular sludge; Benchmark
Online Access: https://doi.org/10.1038/s41598-025-07734-8
author Leandro Di Gloria
Lorenzo Casbarra
Tommaso Lotti
Matteo Ramazzotti
collection DOAJ
description Abstract Biological wastewater treatment processes, such as activated sludge (AS) and aerobic granular sludge (AGS), have proven to be crucial systems for achieving both efficient waste purification and the recovery of valuable resources like poly-hydroxy-alkanoates. Gaining a deeper understanding of the microbial communities underpinning these technologies would enable their optimization, ultimately reducing costs and increasing efficiency. To support this research, we quantitatively compared classification methods differing in read length (raw reads, contigs and MAGs), overall search approach (Kaiju, Kraken2, RiboFrame and kMetaShot), and source databases. Classification performance was assessed at both the genus and species levels using an in silico-generated mock community designed to provide a simplified yet comprehensive representation of the complex microbial ecosystems found in AS and AGS. Particular attention was given to the misclassification of eukaryotes as bacteria and vice versa, as well as the occurrence of false negatives. Notably, Kaiju emerged as the most accurate classifier at both the genus and species levels, followed by RiboFrame and kMetaShot. However, our findings highlight the substantial risk of misclassification across all classifiers and databases, which could significantly hinder the advancement of these technologies by introducing noise and errors for key microbial clades.
format Article
id doaj-art-6aa4153279e549e0b5482c2fabfe9537
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-6aa4153279e549e0b5482c2fabfe9537 2025-08-24T11:28:35Z eng Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; vol. 15, no. 1, pp. 1-13; 10.1038/s41598-025-07734-8
Leandro Di Gloria (0), Lorenzo Casbarra (1), Tommaso Lotti (2), Matteo Ramazzotti (3)
(0) Department of Experimental and Clinical Biomedical Sciences, University of Florence
(1) Department of Experimental and Clinical Biomedical Sciences, University of Florence
(2) Department of Civil and Environmental Engineering, University of Florence
(3) Department of Experimental and Clinical Biomedical Sciences, University of Florence
title Testing the limits of short-reads metagenomic classifications programs in wastewater treating microbial communities
topic Wastewater
Microbial community
Classifications
Aerobic granular sludge
Benchmark
url https://doi.org/10.1038/s41598-025-07734-8