A Machine Learning Approach for the Prediction of Thermostable β-Glucosidases

Thermostable β-glucosidases (E.C. 3.2.1.21) are essential enzymes used in second-generation biofuel production. However, little is known about the structural characteristics that lead to their thermostability. In this study, I used graph-based structural signatures to represent three-dimensional str...

Bibliographic Details
Main Author: Diego Mariano
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Applied Sciences
Subjects: β-glucosidases; machine learning; graph-based structural signatures
Online Access: https://www.mdpi.com/2076-3417/15/9/4839
collection DOAJ
description Thermostable β-glucosidases (E.C. 3.2.1.21) are essential enzymes used in second-generation biofuel production. However, little is known about the structural characteristics that lead to their thermostability. In this study, I used graph-based structural signatures to represent three-dimensional structures of β-glucosidase enzymes extracted from thermophilic organisms. I collected 1717 structures from thermophilic (n = 890) and non-thermophilic (n = 827) organisms and divided them into two datasets: training (n = 1134) and test (n = 583). I then used seven machine learning algorithms to classify them. The best model achieved 77.1% accuracy using logistic regression in training with 10-fold cross-validation and 81.6% accuracy in testing using the CatBoost algorithm. I hypothesize that the signature model proposed here can help understand the structural patterns in thermostable enzymes and shed light on the design of more efficient enzymes for biofuel production.
format Article
id doaj-art-9e0e8267f4794cdba01705f51325e01b
institution Kabale University
issn 2076-3417
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-9e0e8267f4794cdba01705f51325e01b | 2025-08-20T03:52:57Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2025-04-01 | 15 | 9 | 4839 | 10.3390/app15094839 | A Machine Learning Approach for the Prediction of Thermostable β-Glucosidases | Diego Mariano | Department of Computer Science (DCC), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil | Thermostable β-glucosidases (E.C. 3.2.1.21) are essential enzymes used in second-generation biofuel production. However, little is known about the structural characteristics that lead to their thermostability. In this study, I used graph-based structural signatures to represent three-dimensional structures of β-glucosidase enzymes extracted from thermophilic organisms. I collected 1717 structures from thermophilic (n = 890) and non-thermophilic (n = 827) organisms and divided them into two datasets: training (n = 1134) and test (n = 583). I then used seven machine learning algorithms to classify them. The best model achieved 77.1% accuracy using logistic regression in training with 10-fold cross-validation and 81.6% accuracy in testing using the CatBoost algorithm. I hypothesize that the signature model proposed here can help understand the structural patterns in thermostable enzymes and shed light on the design of more efficient enzymes for biofuel production. | https://www.mdpi.com/2076-3417/15/9/4839 | β-glucosidases | machine learning | graph-based structural signatures
topic β-glucosidases
machine learning
graph-based structural signatures
url https://www.mdpi.com/2076-3417/15/9/4839
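
The dataset figures quoted in the description field (1717 structures, a 1134/583 training/test split, 10-fold cross-validation) can be checked with a short script. This is only an illustrative sketch: the record does not say how the folds were built, so the contiguous, unshuffled partition below is an assumption, and the function names `kfold_sizes` and `kfold_indices` are hypothetical, not taken from the article.

```python
# Counts quoted in the record's description field.
THERMOPHILIC = 890
NON_THERMOPHILIC = 827
TRAIN = 1134
TEST = 583
TOTAL = 1717


def kfold_sizes(n_samples, k):
    """Return k nearly equal fold sizes; the first n_samples % k folds get one extra sample."""
    base, extra = divmod(n_samples, k)
    return [base + 1 if i < extra else base for i in range(k)]


def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds (an assumption)."""
    start = 0
    for size in kfold_sizes(n_samples, k):
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# The class counts and the split sizes should both add up to the stated total.
assert THERMOPHILIC + NON_THERMOPHILIC == TOTAL
assert TRAIN + TEST == TOTAL

train_fraction = TRAIN / TOTAL
fold_sizes = kfold_sizes(TRAIN, 10)
print(f"training fraction: {train_fraction:.3f}")
print(f"10-fold validation sizes: {fold_sizes}")
```

Both consistency checks pass for the numbers in the record, and each validation fold holds 113 or 114 of the 1134 training structures.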