A robust and statistical analyzed predictive model for drug toxicity using machine learning

Bibliographic Details
Main Authors: Deepak Rawat, Rohit Bajaj, Rachit Manchanda, Ankush Mehta, Prabhu Paramasivam, Suraj Kumar Bhagat, Abinet Gosaye Ayanie
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-02333-z
Description
Summary: Toxicity prediction has long been a challenging task. Artificial intelligence and machine learning provide a platform for predicting toxicity more accurately and in less time. An optimized ensemble model is compared against seven machine learning algorithms and three deep learning models with regard to state-of-the-art evaluation parameters. The optimized model developed in the paper combines the eager random forest and sluggish KStar techniques. The evaluation parameters are computed and compared for three scenarios: the first uses the original features; the second uses feature selection and a resampling technique with the percentage split method; and the third uses feature selection and a resampling technique with 10-fold cross-validation. Principal component analysis is performed for feature selection. The optimized ensemble model performs well in comparison with the other models in all three scenarios, achieving an accuracy of 77% in the first scenario, 89% in the second, and 93% in the third. The proposed model improves accuracy by 8% over the top-performing machine learning model, KStar, and by 21% over the deep learning model AIPs-DeepEnC-GA. Other important evaluation parameters also improve significantly in comparison with the top-performing models. Further, the concept of the W-saw and L-saw scores is presented for all the scenarios. The optimized ensemble model using feature selection and a resampling technique with 10-fold cross-validation performs best among all machine learning models across the scenarios.
ISSN: 2045-2322
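
Note: The summary contrasts two evaluation protocols, the percentage split method and 10-fold cross-validation. The minimal sketch below illustrates only how the two protocols partition sample indices; it is not the authors' code. The function names, the 66% training fraction, and the seed are assumptions chosen for illustration and do not come from the record.

```python
import random


def percentage_split(n_samples, train_fraction=0.66, seed=0):
    """Shuffle indices once and cut them into a single train/test pair.

    train_fraction=0.66 is an illustrative assumption, not a value
    taken from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield k (train, test) index pairs.

    Every sample lands in exactly one test fold, so each sample is
    used for testing once and for training k - 1 times.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding spreads the shuffled indices over k near-equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With the percentage split, a model is scored once on a single held-out portion; with 10-fold cross-validation, the score is averaged over ten held-out folds that together cover the whole data set.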