SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development

Bibliographic Details
Main Authors: Ottavia Spiga, Anna Visibelli, Francesco Pettini, Bianca Roncaglia, Annalisa Santucci
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-02-01
Series: Frontiers in Cellular and Infection Microbiology
Online Access: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1536156/full
Description
Summary:
Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and for understanding host-pathogen interactions in bacterial diseases, particularly Salmonella infections, which remain a significant global health challenge.
Methods: We developed SHASI-ML, a machine learning-based framework for predicting immunogenic proteins in Salmonella species. The model was trained and validated on a curated dataset of experimentally verified immunogenic and non-immunogenic proteins. Three distinct feature groups were extracted from protein sequences: global properties, sequence-derived features, and structural information. The Extreme Gradient Boosting (XGBoost) algorithm was employed for model development and optimization.
Results: SHASI-ML demonstrated robust performance in identifying bacterial immunogens, achieving 89.3% precision and 91.2% specificity. When applied to the Salmonella enterica serovar Typhimurium proteome, the model identified 292 novel immunogenic protein candidates. Global properties emerged as the most influential feature group for prediction accuracy, followed by structural and sequence information. The model showed superior recall and F1-scores compared with existing computational approaches.
Discussion: These findings establish SHASI-ML as an efficient computational tool for prioritizing immunogenic candidates in Salmonella vaccine development. By streamlining the identification of vaccine candidates early in the development process, this approach significantly reduces experimental burden and associated costs. The methodology can be applied to guide and optimize both research and industrial-scale production of Salmonella vaccines, potentially accelerating the development of more effective immunization strategies.
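The summary reports precision, specificity, recall, and F1-score. As a reminder of how these relate, here is a minimal sketch that derives all four from a binary confusion matrix, assuming labels of 1 for immunogenic and 0 for non-immunogenic. The function name, the label encoding, and the toy labels are illustrative assumptions and are not taken from the article.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    precision: float
    recall: float
    specificity: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute precision, recall, specificity and F1 for 0/1 labels.

    Hypothetical helper for illustration; 1 = immunogenic, 0 = non-immunogenic.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(precision, recall, specificity, f1)


# Toy labels, made up for this example only (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

Precision and specificity both penalize false positives, which is why they are the natural figures to watch when the goal is to limit wasted experimental follow-up, while recall and F1 also account for missed immunogens.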
ISSN: 2235-2988