AI-driven prediction of drug activity against Toxoplasma gondii: Data augmentation and deep neural networks for limited datasets
Toxoplasmosis, caused by Toxoplasma gondii (T. gondii), is a serious global health concern, particularly in immunocompromised individuals. Inhibiting the enzyme TgDHFR is a promising strategy for developing treatments. This Artificial Intelligence (AI)-driven Quantitative Structure-Activity Relation...
Main Authors: | Natalia V. Karimova, Ravithree D. Senanayake |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-06-01 |
Series: | Artificial Intelligence Chemistry |
Subjects: | Artificial intelligence, Deep learning, Machine learning, Toxoplasma gondii, TgDHFR, pIC50 |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2949747725000016 |
_version_ | 1823861118261526528 |
---|---|
author | Natalia V. Karimova; Ravithree D. Senanayake |
author_facet | Natalia V. Karimova; Ravithree D. Senanayake |
author_sort | Natalia V. Karimova |
collection | DOAJ |
description | Toxoplasmosis, caused by Toxoplasma gondii (T. gondii), is a serious global health concern, particularly in immunocompromised individuals. Inhibiting the enzyme TgDHFR is a promising strategy for developing treatments. This Artificial Intelligence (AI)-driven Quantitative Structure-Activity Relationship (QSAR) study applies deep neural networks (DNNs) to predict pIC50 values for potential inhibitors, using 2D and 3D molecular descriptors and fingerprints. To address the limited training data, we introduced a novel methodology combining targeted descriptor selection, Gaussian noise-based data augmentation, and an ensemble of DNNs. This approach significantly enhanced model performance, raising R² from 0.75 with the original dataset to 0.85. The model was further validated on two FDA-approved drugs for T. gondii treatment (pyrimethamine and trimethoprim), yielding relative errors of 3.35% and 2.15% in predicted pIC50 relative to experimental values. Finally, the model was applied to screen FDA-approved drugs after filtering out molecules that did not match the characteristics of the training dataset. The predicted pIC50 values were then used to calculate ligand efficiency (LE), binding efficiency index (BEI), lipophilic ligand efficiency (LLE), and surface efficiency index (SEI), identifying the most promising TgDHFR inhibitors for further investigation. By combining AI with data augmentation, this study provides a powerful tool for predicting the pIC50 of TgDHFR inhibitors, and the approach can be adapted to other systems. |
format | Article |
id | doaj-art-8e3c0acaa45b4159b0e9f300ea82a284 |
institution | Kabale University |
issn | 2949-7477 |
language | English |
publishDate | 2025-06-01 |
publisher | Elsevier |
record_format | Article |
series | Artificial Intelligence Chemistry |
spelling | doaj-art-8e3c0acaa45b4159b0e9f300ea82a284; 2025-02-10T04:35:34Z; eng; Elsevier; Artificial Intelligence Chemistry; 2949-7477; 2025-06-01; 3; 1; 100084; AI-driven prediction of drug activity against Toxoplasma gondii: Data augmentation and deep neural networks for limited datasets; Natalia V. Karimova (0), Ravithree D. Senanayake (1); (0) University of California Irvine, Chemistry Department, Irvine, CA 92697, USA, corresponding author; (1) University of Minnesota, Chemistry Department, Minneapolis, MN 55455, USA; http://www.sciencedirect.com/science/article/pii/S2949747725000016; Artificial intelligence, Deep learning, Machine learning, Toxoplasma gondii, TgDHFR, pIC50 |
title | AI-driven prediction of drug activity against Toxoplasma gondii: Data augmentation and deep neural networks for limited datasets |
topic | Artificial intelligence; Deep learning; Machine learning; Toxoplasma gondii; TgDHFR; pIC50 |
url | http://www.sciencedirect.com/science/article/pii/S2949747725000016 |
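The description mentions Gaussian noise-based data augmentation, relative errors in predicted pIC50, and efficiency indices (LE, BEI, LLE, SEI) derived from predicted pIC50. The following is a minimal Python sketch of those three ideas, not the authors' implementation. This record does not give the noise scale, the number of noisy copies, or the exact index definitions, so `sigma`, `copies`, all function names, and the formulas (commonly used textbook forms, with the 1.37 kcal/mol factor for LE near room temperature) are assumptions.

```python
import random


def augment_with_gaussian_noise(features, targets, copies=3, sigma=0.01, seed=0):
    """Return the original samples plus `copies` noisy replicas of each one.

    Assumptions (not taken from the record): noise is drawn from
    N(0, sigma^2) and added to every descriptor value, and the target
    (pIC50) is copied unchanged to each replica.
    """
    rng = random.Random(seed)
    aug_x = [list(row) for row in features]
    aug_y = list(targets)
    for row, y in zip(features, targets):
        for _ in range(copies):
            aug_x.append([value + rng.gauss(0.0, sigma) for value in row])
            aug_y.append(y)
    return aug_x, aug_y


def relative_error_percent(predicted, experimental):
    """Relative error of a predicted value against an experimental one, in %."""
    return abs(predicted - experimental) / abs(experimental) * 100.0


def efficiency_indices(pic50, heavy_atoms, mol_weight, logp, psa):
    """Commonly used efficiency indices computed from a pIC50 value.

    Assumed definitions:
      LE  = 1.37 * pIC50 / heavy-atom count   (kcal/mol per heavy atom)
      BEI = pIC50 / molecular weight in kDa
      LLE = pIC50 - logP
      SEI = pIC50 / (polar surface area / 100 A^2)
    """
    return {
        "LE": 1.37 * pic50 / heavy_atoms,
        "BEI": pic50 / (mol_weight / 1000.0),
        "LLE": pic50 - logp,
        "SEI": pic50 / (psa / 100.0),
    }


if __name__ == "__main__":
    # Hypothetical descriptor rows and pIC50 values, for illustration only.
    x = [[0.12, 1.50, 3.20], [0.45, 1.10, 2.70]]
    y = [6.8, 7.4]
    aug_x, aug_y = augment_with_gaussian_noise(x, y, copies=3, sigma=0.01)
    print(len(aug_x), "samples after augmentation")
    print(efficiency_indices(pic50=7.4, heavy_atoms=17, mol_weight=248.7,
                             logp=2.7, psa=77.8))
```

In practice the descriptors would usually be standardized first so that one `sigma` is meaningful across columns, and augmentation would be applied only to the training split so that noisy copies of a compound do not leak into validation data.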