SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development
Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and understanding host-pathogen interactions in bacterial diseases, particularly for Salmonella infections which remain a significant global health challenge. Methods: We developed SHASI-ML, a machine learning-ba...
Main Authors: | Ottavia Spiga, Anna Visibelli, Francesco Pettini, Bianca Roncaglia, Annalisa Santucci |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2025-02-01 |
Series: | Frontiers in Cellular and Infection Microbiology |
Subjects: | Salmonella, artificial intelligence, machine learning, vaccines, immunogenicity |
Online Access: | https://www.frontiersin.org/articles/10.3389/fcimb.2025.1536156/full |
_version_ | 1823859153648484352 |
---|---|
author | Ottavia Spiga; Anna Visibelli; Francesco Pettini; Bianca Roncaglia; Annalisa Santucci |
author_facet | Ottavia Spiga; Anna Visibelli; Francesco Pettini; Bianca Roncaglia; Annalisa Santucci |
author_sort | Ottavia Spiga |
collection | DOAJ |
description | Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and understanding host-pathogen interactions in bacterial diseases, particularly for Salmonella infections which remain a significant global health challenge. Methods: We developed SHASI-ML, a machine learning-based framework for predicting immunogenic proteins in Salmonella species. The model was trained and validated using a curated dataset of experimentally verified immunogenic and non-immunogenic proteins. Three distinct feature groups were extracted from protein sequences: global properties, sequence-derived features, and structural information. The Extreme Gradient Boosting (XGBoost) algorithm was employed for model development and optimization. Results: SHASI-ML demonstrated robust performance in identifying bacterial immunogens, achieving 89.3% precision and 91.2% specificity. When applied to the Salmonella enterica serovar Typhimurium proteome, the model identified 292 novel immunogenic protein candidates. Global properties emerged as the most influential feature group in prediction accuracy, followed by structural and sequence information. The model showed superior recall and F1-scores compared to existing computational approaches. Discussion: These findings establish SHASI-ML as an efficient computational tool for prioritizing immunogenic candidates in Salmonella vaccine development. By streamlining the identification of vaccine candidates early in the development process, this approach significantly reduces experimental burden and associated costs. The methodology can be applied to guide and optimize both research and industrial-scale production of Salmonella vaccines, potentially accelerating the development of more effective immunization strategies. |
format | Article |
id | doaj-art-045ab71159874c8a88af69260d85081b |
institution | Kabale University |
issn | 2235-2988 |
language | English |
publishDate | 2025-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Cellular and Infection Microbiology |
spelling | doaj-art-045ab71159874c8a88af69260d85081b; 2025-02-11T07:00:03Z; eng; Frontiers Media S.A.; Frontiers in Cellular and Infection Microbiology; 2235-2988; 2025-02-01; 15; 10.3389/fcimb.2025.1536156; 1536156; SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development; Ottavia Spiga 0; Ottavia Spiga 1; Ottavia Spiga 2; Anna Visibelli 3; Francesco Pettini 4; Bianca Roncaglia 5; Annalisa Santucci 6; Annalisa Santucci 7; Annalisa Santucci 8; Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Competence Center Advanced Robotics and enabling digital TEchnologies & Systems 4.0 (ARTES 4.0), Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; SienabioACTIVE-SbA, Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; School of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy; Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Competence Center Advanced Robotics and enabling digital TEchnologies & Systems 4.0 (ARTES 4.0), Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; SienabioACTIVE-SbA, Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy; Introduction: Accurate prediction of immunogenic proteins is crucial for vaccine development and understanding host-pathogen interactions in bacterial diseases, particularly for Salmonella infections which remain a significant global health challenge. Methods: We developed SHASI-ML, a machine learning-based framework for predicting immunogenic proteins in Salmonella species. The model was trained and validated using a curated dataset of experimentally verified immunogenic and non-immunogenic proteins. Three distinct feature groups were extracted from protein sequences: global properties, sequence-derived features, and structural information. The Extreme Gradient Boosting (XGBoost) algorithm was employed for model development and optimization. Results: SHASI-ML demonstrated robust performance in identifying bacterial immunogens, achieving 89.3% precision and 91.2% specificity. When applied to the Salmonella enterica serovar Typhimurium proteome, the model identified 292 novel immunogenic protein candidates. Global properties emerged as the most influential feature group in prediction accuracy, followed by structural and sequence information. The model showed superior recall and F1-scores compared to existing computational approaches. Discussion: These findings establish SHASI-ML as an efficient computational tool for prioritizing immunogenic candidates in Salmonella vaccine development. By streamlining the identification of vaccine candidates early in the development process, this approach significantly reduces experimental burden and associated costs. The methodology can be applied to guide and optimize both research and industrial-scale production of Salmonella vaccines, potentially accelerating the development of more effective immunization strategies.; https://www.frontiersin.org/articles/10.3389/fcimb.2025.1536156/full; Salmonella; artificial intelligence; machine learning; vaccines; immunogenicity |
spellingShingle | Ottavia Spiga Ottavia Spiga Ottavia Spiga Anna Visibelli Francesco Pettini Bianca Roncaglia Annalisa Santucci Annalisa Santucci Annalisa Santucci SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development Frontiers in Cellular and Infection Microbiology Salmonella artificial intelligence machine learning vaccines immunogenicity |
title | SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development |
title_full | SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development |
title_fullStr | SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development |
title_full_unstemmed | SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development |
title_short | SHASI-ML: a machine learning-based approach for immunogenicity prediction in Salmonella vaccine development |
title_sort | shasi ml a machine learning based approach for immunogenicity prediction in salmonella vaccine development |
topic | Salmonella artificial intelligence machine learning vaccines immunogenicity |
url | https://www.frontiersin.org/articles/10.3389/fcimb.2025.1536156/full |
work_keys_str_mv | AT ottaviaspiga shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT ottaviaspiga shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT ottaviaspiga shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT annavisibelli shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT francescopettini shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT biancaroncaglia shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT annalisasantucci shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT annalisasantucci shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment AT annalisasantucci shasimlamachinelearningbasedapproachforimmunogenicitypredictioninsalmonellavaccinedevelopment |
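
The abstract in the description field reports precision, specificity, recall and F1-score for a binary immunogenic / non-immunogenic classifier. As a reminder of how these four measures are conventionally defined from a confusion matrix, here is a minimal sketch in Python. The function name `binary_metrics` and the counts passed to it are hypothetical illustrations; they are not taken from the article and do not reproduce its results or its evaluation code.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # actual positives that are found
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # actual negatives that are rejected
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean of precision and recall
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and specificity both penalize false positives, which is presumably why they matter for shortlisting vaccine candidates, while recall tracks how many true immunogens are missed.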