SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development

Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and for understanding host-pathogen interactions in bacterial diseases, particularly Salmonella infections, which remain a significant global health challenge. Methods: We developed SHASI-ML, a machine learning-ba...

Bibliographic Details
Main Authors: Ottavia Spiga, Anna Visibelli, Francesco Pettini, Bianca Roncaglia, Annalisa Santucci
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-02-01
Series: Frontiers in Cellular and Infection Microbiology
Subjects: Salmonella; artificial intelligence; machine learning; vaccines; immunogenicity
Online Access: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1536156/full
collection DOAJ
description Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and for understanding host-pathogen interactions in bacterial diseases, particularly Salmonella infections, which remain a significant global health challenge.
Methods: We developed SHASI-ML, a machine learning-based framework for predicting immunogenic proteins in Salmonella species. The model was trained and validated using a curated dataset of experimentally verified immunogenic and non-immunogenic proteins. Three distinct feature groups were extracted from protein sequences: global properties, sequence-derived features, and structural information. The Extreme Gradient Boosting (XGBoost) algorithm was employed for model development and optimization.
Results: SHASI-ML demonstrated robust performance in identifying bacterial immunogens, achieving 89.3% precision and 91.2% specificity. When applied to the Salmonella enterica serovar Typhimurium proteome, the model identified 292 novel immunogenic protein candidates. Global properties emerged as the most influential feature group for prediction accuracy, followed by structural and sequence information. The model showed superior recall and F1-scores compared to existing computational approaches.
Discussion: These findings establish SHASI-ML as an efficient computational tool for prioritizing immunogenic candidates in Salmonella vaccine development. By streamlining the identification of vaccine candidates early in the development process, this approach significantly reduces experimental burden and associated costs. The methodology can be applied to guide and optimize both research and industrial-scale production of Salmonella vaccines, potentially accelerating the development of more effective immunization strategies.
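The abstract reports precision, specificity, recall, and F1-score without stating how they are defined. As a reading aid, the sketch below computes these four metrics from the counts of a binary confusion matrix using their standard definitions. The names `ConfusionCounts` and `binary_metrics` and the example counts are hypothetical; they are not taken from the article and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (immunogenic = positive class)."""
    tp: int  # immunogenic proteins predicted immunogenic
    fp: int  # non-immunogenic proteins predicted immunogenic
    tn: int  # non-immunogenic proteins predicted non-immunogenic
    fn: int  # immunogenic proteins predicted non-immunogenic


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Standard definitions of the metrics named in the abstract."""
    precision = _safe_div(c.tp, c.tp + c.fp)    # TP / (TP + FP)
    recall = _safe_div(c.tp, c.tp + c.fn)       # TP / (TP + FN)
    specificity = _safe_div(c.tn, c.tn + c.fp)  # TN / (TN + FP)
    # F1 is the harmonic mean of precision and recall.
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Illustrative counts only (hypothetical, not the article's data).
example = ConfusionCounts(tp=80, fp=20, tn=180, fn=20)
metrics = binary_metrics(example)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Precision and specificity both fall as false positives increase, which may be why the abstract reports them together: they describe how rarely a non-immunogenic protein is proposed as a candidate.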
id doaj-art-045ab71159874c8a88af69260d85081b
institution Kabale University
issn 2235-2988
spelling doaj-art-045ab71159874c8a88af69260d85081b; 2025-02-11T07:00:03Z; eng; Frontiers Media S.A.; Frontiers in Cellular and Infection Microbiology; ISSN 2235-2988; 2025-02-01; volume 15; DOI 10.3389/fcimb.2025.1536156; article 1536156
Author affiliations:
Ottavia Spiga: Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Competence Center Advanced Robotics and enabling digital TEchnologies & Systems 4.0 (ARTES 4.0), Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; SienabioACTIVE-SbA, Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
Anna Visibelli: Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
Francesco Pettini: School of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy
Bianca Roncaglia: Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
Annalisa Santucci: Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Competence Center Advanced Robotics and enabling digital TEchnologies & Systems 4.0 (ARTES 4.0), Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; SienabioACTIVE-SbA, Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
topic Salmonella
artificial intelligence
machine learning
vaccines
immunogenicity