Machine Learning Monte Carlo Approaches and Statistical Physics Notions to Characterize Bacterial Species in Human Microbiota

Bibliographic Details
Main Authors: Michele Bellingeri, Leonardo Mancabelli, Christian Milani, Gabriele Andrea Lugli, Roberto Alfieri, Massimiliano Turchetto, Marco Ventura, Davide Cassi
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Machine Learning and Knowledge Extraction
Online Access: https://www.mdpi.com/2504-4990/6/4/117
Description
Summary: Recent studies have shown correlations between the microbiota’s composition and various health conditions. Machine learning (ML) techniques are essential for analyzing complex biological data, particularly in microbiome research. ML methods help analyze large datasets to uncover microbiota patterns and understand how these patterns affect human health. This study introduces a novel approach that combines statistical physics with Monte Carlo (MC) methods to characterize bacterial species in the human microbiota. We assess the significance of bacterial species in different age groups in two ways: by using statistical distances to evaluate species prevalence and abundance across age groups, and by employing MC simulations based on statistical mechanics principles. Our findings show that the microbiota composition undergoes a significant transition from early childhood to adulthood. Species such as Bifidobacterium breve and Veillonella parvula decrease with age, while others, such as Agathobaculum butyriciproducens and Eubacterium rectale, increase. Additionally, low-prevalence species may be important in characterizing age groups. Finally, we propose an overall species ranking that integrates the methods proposed here in a multicriteria classification strategy. Our research provides a comprehensive tool for microbiota analysis using statistical notions, ML techniques, and MC simulations.
ISSN: 2504-4990