SOuLMuSiC, a novel tool for predicting the impact of mutations on protein solubility

Bibliographic Details
Main Authors: Simone Attanasio, Jean Kwasigroch, Marianne Rooman, Fabrizio Pucci
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-11326-x
author Simone Attanasio
Jean Kwasigroch
Marianne Rooman
Fabrizio Pucci
collection DOAJ
description Abstract Protein solubility problems arise in a wide range of applications, from antibody development to enzyme production, and are linked to several major disorders, including cataracts and Alzheimer’s disease. To assist scientists in designing proteins with improved solubility and in better understanding solubility-related diseases, we introduce SOuLMuSiC, a computational tool for the fast and accurate prediction of the impact of single-site mutations on protein solubility. Our model is based on a simple artificial neural network that takes as input a series of features, including biophysical properties of wild-type and mutated residues, energetic values computed using various statistical potentials, and mutational scores derived from protein language models. SOuLMuSiC has been trained on a curated dataset of about 700 single-site mutations with known solubility values, collected and manually verified from original literature. It significantly outperforms current state-of-the-art predictors in strict cross-validation: the Spearman correlation reaches 0.5 when solubility changes are represented categorically; for the subset with quantitative values, it increases to 0.7. SOuLMuSiC also shows good performance on external datasets containing high-throughput enzyme solubility-related data as well as protein aggregation propensities. In summary, SOuLMuSiC is a valuable tool for identifying mutations that impact protein solubility, and can play a major role in the rational design of proteins with improved solubility and in understanding the effects of genetic variants. It is freely available for academic use at http://babylone.ulb.ac.be/SoulMuSiC/.
format Article
id doaj-art-6644fa3882aa46ab8440cf864f2ce14c
institution DOAJ
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-6644fa3882aa46ab8440cf864f2ce14c
2025-08-20T03:05:21Z
Scientific Reports, vol. 15, no. 1, pp. 1-13 (2025-07-01), ISSN 2045-2322
Simone Attanasio, Jean Kwasigroch, Marianne Rooman, Fabrizio Pucci (all: Computational Biology and Bioinformatics, Université Libre de Bruxelles)
https://doi.org/10.1038/s41598-025-11326-x
title SOuLMuSiC, a novel tool for predicting the impact of mutations on protein solubility
url https://doi.org/10.1038/s41598-025-11326-x