A Machine Learning Approach for the Prediction of Thermostable β-Glucosidases
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/9/4839 |
| Summary: | Thermostable β-glucosidases (E.C. 3.2.1.21) are essential enzymes used in second-generation biofuel production. However, little is known about the structural characteristics that lead to their thermostability. In this study, I used graph-based structural signatures to represent three-dimensional structures of β-glucosidase enzymes extracted from thermophilic organisms. I collected 1717 structures from thermophilic (*n* = 890) and non-thermophilic (*n* = 827) organisms and divided them into two datasets: training (*n* = 1134) and test (*n* = 583). I then used seven machine learning algorithms to classify them. In training with 10-fold cross-validation, the best model was logistic regression, with 77.1% accuracy; in testing, the best model was CatBoost, with 81.6% accuracy. I hypothesize that the signature model proposed here can help understand the structural patterns in thermostable enzymes and shed light on the design of more efficient enzymes for biofuel production. |
| ISSN: | 2076-3417 |
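
The dataset sizes in the summary can be checked with a short sketch. It rebuilds only the bookkeeping of the evaluation protocol: a 1134/583 train/test split of 1717 labeled structures and a 10-fold partition of the training set. The labels are placeholders, and the helper names `split_train_test` and `k_fold_indices`, the random seed, and the plain (non-stratified) shuffle are assumptions made for illustration; the record does not say how the author performed the split or which tools were used.

```python
import random

# Sizes reported in the summary.
N_THERMOPHILIC = 890
N_NON_THERMOPHILIC = 827
N_TRAIN = 1134
N_TEST = 583


def split_train_test(n_samples, n_train, seed=0):
    """Shuffle sample indices and split them into training and test lists.

    Hypothetical helper: a plain random split, not necessarily the author's.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(train_indices, k=10):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    # Deal indices round-robin into k folds so sizes differ by at most one.
    folds = [train_indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# Placeholder labels: 1 = thermophilic, 0 = non-thermophilic.
labels = [1] * N_THERMOPHILIC + [0] * N_NON_THERMOPHILIC
train_idx, test_idx = split_train_test(len(labels), N_TRAIN)
folds = list(k_fold_indices(train_idx, k=10))

print("total structures:", len(labels))
print("train / test:", len(train_idx), "/", len(test_idx))
print("test fraction:", round(len(test_idx) / len(labels), 3))
print("validation fold sizes:", sorted({len(val) for _, val in folds}))
```

The two class counts sum to 1717, as do the training and test counts, so roughly one third of the structures are held out for testing. In each cross-validation round, a classifier would be fitted on the `fit` indices and scored on the `validation` indices, with the ten scores averaged.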