Uncovering encrypted antimicrobial peptides in health-associated Lactobacillaceae by large-scale genomics and machine learning

Abstract. Background: Antimicrobial peptides (AMPs) are well known for their broad-spectrum activity and have shown great promise in addressing the antibiotic-resistance crisis. The Lactobacillaceae family, recognized for its health-promoting effects in humans, represents a valuable source of novel AMP...

Bibliographic Details
Main Authors: Rubing Du, Fei Han, Zhen Li, Jing Yu, Yan Xu, Yongguang Huang, Qun Wu
Format: Article
Language: English
Published: BMC 2025-06-01
Series: Microbiome
Subjects: Antibiotic resistance; Antimicrobial peptides; Genome mining; Lactobacillaceae; Machine learning
Online Access: https://doi.org/10.1186/s40168-025-02145-3
author Rubing Du
Fei Han
Zhen Li
Jing Yu
Yan Xu
Yongguang Huang
Qun Wu
collection DOAJ
description Abstract
Background: Antimicrobial peptides (AMPs) are well known for their broad-spectrum activity and have shown great promise in addressing the antibiotic-resistance crisis. The Lactobacillaceae family, recognized for its health-promoting effects in humans, represents a valuable source of novel AMPs. However, the global prevalence and distribution of AMPs within Lactobacillaceae remain largely unknown, which limits the efficient discovery and development of novel AMPs.
Results: We analyzed all available genomes (10,327 genomes), encompassing 38 genera and 515 species, to investigate the biosynthetic potential of AMPs (indicated by the number of AMP sequences in the genome) in the Lactobacillaceae family. We demonstrated that Lactobacillaceae species had widespread (69.90%) biosynthetic potential for AMPs. Overall, 9601 AMPs were identified, clustering into 2092 gene cluster families (GCFs), which showed strong interspecies specificity (95.27%), intraspecies heterogeneity (93.31%), and habitat uniqueness (95.83%), greatly expanding the AMP sequence landscape. Novelty assessment indicated that 1516 GCFs (72.47%) had no similarity to any known AMPs in existing databases. Machine learning predictions suggested that novel AMPs from Lactobacillaceae possessed strong antimicrobial potential, with 664 GCFs having an additive minimum inhibitory concentration (MIC) below 100 μM. We randomly selected and synthesized 16 AMPs (with predicted MIC < 100 μM) and identified 10 AMPs exhibiting varied-spectrum activity against 11 common pathogens. Finally, we identified one Lactobacillus delbrueckii-originated AMP (delbruin_1) with broad-spectrum (all 11 pathogens) and high antimicrobial activity (average MIC = 38.56 μM), which proved its potential as a clinically viable antimicrobial agent.
Conclusions: We uncovered the global prevalence of AMPs in Lactobacillaceae and proved that Lactobacillaceae is an untapped and invaluable source of novel AMPs to combat the antibiotic-resistance crisis. Meanwhile, we provided a machine learning-guided framework for AMP discovery, offering a scalable roadmap for identifying novel AMPs not only in Lactobacillaceae but also in other organisms.
format Article
id doaj-art-2940dc1c417c45a0a809449e3c779b0d
institution OA Journals
issn 2049-2618
language English
publishDate 2025-06-01
publisher BMC
record_format Article
series Microbiome
spelling doaj-art-2940dc1c417c45a0a809449e3c779b0d
Record updated: 2025-08-20T02:37:33Z
Language: eng
Publisher: BMC
Journal: Microbiome (ISSN 2049-2618)
Published: 2025-06-01, volume 13, issue 1, pages 1-14
DOI: 10.1186/s40168-025-02145-3
Title: Uncovering encrypted antimicrobial peptides in health-associated Lactobacillaceae by large-scale genomics and machine learning
Author affiliations:
Rubing Du: Lab of Brewing Microbiology and Applied Enzymology, The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University
Fei Han: School of Food Science and Pharmaceutical Engineering, Nanjing Normal University
Zhen Li: School of Food Science and Pharmaceutical Engineering, Nanjing Normal University
Jing Yu: School of Food Science and Pharmaceutical Engineering, Nanjing Normal University
Yan Xu: Lab of Brewing Microbiology and Applied Enzymology, The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University
Yongguang Huang: School of Liquor and Food Engineering, Key Laboratory of Fermentation Engineering and Biological Pharmacy of Guizhou Province, Guizhou University
Qun Wu: Lab of Brewing Microbiology and Applied Enzymology, The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University
title Uncovering encrypted antimicrobial peptides in health-associated Lactobacillaceae by large-scale genomics and machine learning
topic Antibiotic resistance
Antimicrobial peptides
Genome mining
Lactobacillaceae
Machine learning
url https://doi.org/10.1186/s40168-025-02145-3