Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs

Abstract: We have adopted the classification Read-Across Structure–Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML mod...

Bibliographic Details
Main Authors:	Arkaprava Banerjee, Kunal Roy
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Subjects:	QSAR, c-RASAR, Machine learning, Sum of Ranking Differences (SRD), Nephrotoxicity, ARKA
Online Access:	https://doi.org/10.1038/s41598-024-85063-y

_version_	1841559592661680128
author	Arkaprava Banerjee, Kunal Roy
author_facet	Arkaprava Banerjee, Kunal Roy
author_sort	Arkaprava Banerjee
collection	DOAJ
description	Abstract: We have adopted the classification Read-Across Structure–Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to simply as “descriptors” in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as “fingerprints” in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models. All 36 models were cross-validated 20 times with a fivefold cross-validation strategy, and their predictivity was checked on the test set data. A multi-criteria decision-making strategy, the Sum of Ranking Differences (SRD) approach, was adopted to identify the best-performing model based on robustness and external validation parameters. This statistical analysis suggested that the c-RASAR models had an overall good performance, while the best-performing model was also a c-RASAR model (LDA c-RASAR model derived from topological descriptors, with MCC values of 0.229 and 0.431 for the training and test sets, respectively). This model was used to screen a true external data set prepared from the known nephrotoxic compounds of DrugBankDB, demonstrating good predictivity.
format	Article
id	doaj-art-4404d6f8b5f346e0800aa38ca1f29d46
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-4404d6f8b5f346e0800aa38ca1f29d46 2025-01-05T12:21:21Z eng Nature Portfolio Scientific Reports 2045-2322 2025-01-01 151120 10.1038/s41598-024-85063-y Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs Arkaprava Banerjee 0 Kunal Roy 1 Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University Abstract: We have adopted the classification Read-Across Structure–Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to simply as “descriptors” in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as “fingerprints” in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models. All 36 models were cross-validated 20 times with a fivefold cross-validation strategy, and their predictivity was checked on the test set data. A multi-criteria decision-making strategy, the Sum of Ranking Differences (SRD) approach, was adopted to identify the best-performing model based on robustness and external validation parameters. This statistical analysis suggested that the c-RASAR models had an overall good performance, while the best-performing model was also a c-RASAR model (LDA c-RASAR model derived from topological descriptors, with MCC values of 0.229 and 0.431 for the training and test sets, respectively). This model was used to screen a true external data set prepared from the known nephrotoxic compounds of DrugBankDB, demonstrating good predictivity. https://doi.org/10.1038/s41598-024-85063-y QSAR c-RASAR Machine learning Sum of Ranking Differences (SRD) Nephrotoxicity ARKA
spellingShingle	Arkaprava Banerjee Kunal Roy Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs Scientific Reports QSAR c-RASAR Machine learning Sum of Ranking Differences (SRD) Nephrotoxicity ARKA
title	Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs
title_sort	machine learning assisted classification rasar modeling for the nephrotoxicity potential of a curated set of orally active drugs
topic	QSAR, c-RASAR, Machine learning, Sum of Ranking Differences (SRD), Nephrotoxicity, ARKA
url	https://doi.org/10.1038/s41598-024-85063-y
work_keys_str_mv	AT arkapravabanerjee machinelearningassistedclassificationrasarmodelingforthenephrotoxicitypotentialofacuratedsetoforallyactivedrugs AT kunalroy machinelearningassistedclassificationrasarmodelingforthenephrotoxicitypotentialofacuratedsetoforallyactivedrugs
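
Note on the reported metric: the abstract quotes Matthews correlation coefficient (MCC) values for the training and test sets. As a minimal sketch of how MCC is obtained from a binary confusion matrix, the following may help. The function name and the example counts are hypothetical illustrations and are not taken from the paper or its data.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    # Product of the four marginal totals (predicted +/-, actual +/-).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not from the paper).
example = mcc(tp=40, tn=30, fp=20, fn=10)
print(round(example, 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it a common choice for imbalanced toxicity datasets.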

Similar Items