A computational approach for prediction of viscosity of chemical compounds based on molecular structures


Bibliographic Details
Main Authors: Sneha Das, Ram Kishore Roy, Tulshi Bezboruah
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Results in Chemistry
Online Access: http://www.sciencedirect.com/science/article/pii/S2211715625000220
Description
Summary: This paper explores the feasibility of predicting the viscosity of diverse chemical compounds at 25 °C from their molecular structures using supervised machine learning. Three supervised algorithms were implemented: Random Forest, Gradient Boosting and CatBoost. The dataset consists of the Simplified Molecular Input Line Entry System (SMILES) notation of 320 chemical compounds and their corresponding viscosities at 25 °C. Ten features were generated from the compounds and used to correlate with and predict the viscosity values. The results suggest that the CatBoost algorithm (R² = 0.94, MSE = 0.64) predicts viscosity better than Gradient Boosting (R² = 0.90, MSE = 1.05) and Random Forest (R² = 0.86, MSE = 1.39). Although the Random Forest model lags behind, the overall results may be useful for predicting viscosity accurately. The findings also reveal which feature contributes the most to each model. The proposed models could be used in several industries, including food and beverage, cosmetics, medicine and pharmaceuticals.
ISSN: 2211-7156
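
Note on the reported metrics: the summary compares the models by R² and MSE. The sketch below shows how these two quantities are conventionally computed, using the standard textbook definitions in plain Python. The function names and the sample values are illustrative assumptions, not the authors' code or data.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1 - ss_res / ss_tot


# Made-up placeholder values (NOT taken from the paper's dataset).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mse = mean_squared_error(measured, predicted)
r2 = r2_score(measured, predicted)
print(f"MSE = {mse:.3f}, R2 = {r2:.3f}")
```

A higher R² and a lower MSE both indicate a closer fit, which is the sense in which the summary ranks CatBoost ahead of Gradient Boosting and Random Forest.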