Applying machine learning to classify table olives using bacterial metataxonomic data

Bibliographic Details
Main Authors: Elio López-García, Antonio Benítez-Cabello, Francisco Noé Arroyo-López
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: npj Science of Food
Online Access: https://doi.org/10.1038/s41538-025-00496-7
Description
Summary: In recent years, metataxonomic analysis has been increasingly used to characterize microbial communities in fermented foods. Advances in bioinformatics and machine learning (ML) have also expanded the resources available for analyzing these metataxonomic data. Tree-based algorithms are particularly valuable for their interpretability. This work compares three tree-based ML algorithms, namely Classification and Regression Tree, Random Forest (RF), and Extreme Gradient Boosting, for the analysis of a database composed of 442 samples of 16S rRNA bacterial profiles obtained from table olives. The findings show that ML techniques can effectively classify bacterial profiles by olive processing type, cultivar, country of origin, and isolation matrix. The RF model achieved the highest accuracy, reaching 97% in the best cases, with a kappa coefficient above 0.8 for most categories. This approach has potential applications in the table olive sector and in other food products, where the industrial application of ML techniques could enhance traceability, authenticity, and quality control.
ISSN: 2396-8370
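
The summary reports classifier performance as accuracy and as a kappa coefficient. As a rough illustration of how the two metrics relate, the sketch below computes both from a pair of label lists in plain Python. It assumes that "kappa" refers to Cohen's kappa, which the record does not state explicitly. The function names and the `type_A`/`type_B` labels are hypothetical and are not taken from the article or its data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (the accuracy) and p_e is the agreement expected if true and predicted
    labels were independent with the same marginal frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Only one class appears in both lists, so kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for ten samples (illustrative only, not study data).
y_true = ["type_A"] * 5 + ["type_B"] * 5
y_pred = ["type_A"] * 4 + ["type_B"] * 6  # one type_A sample misclassified

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts the agreement expected by chance, it is typically lower than accuracy on the same predictions, which is why the two figures are often reported together.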