AI-driven prediction of drug activity against Toxoplasma gondii: Data augmentation and deep neural networks for limited datasets
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-06-01 |
Series: | Artificial Intelligence Chemistry |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2949747725000016 |
Summary: | Toxoplasmosis, caused by Toxoplasma gondii (T. gondii), is a serious global health concern, particularly in immunocompromised individuals. Inhibiting the enzyme TgDHFR is a promising strategy for developing treatments. This Artificial Intelligence (AI)-driven Quantitative Structure-Activity Relationship (QSAR) study applies deep neural networks (DNNs) to predict pIC50 values for potential inhibitors, using 2D and 3D molecular descriptors and fingerprints. To address the limited training data, we introduce a novel methodology combining targeted descriptor selection, Gaussian noise-based data augmentation, and an ensemble of DNNs. This approach significantly improved model performance, raising R² from 0.75 on the original dataset to 0.85. The model was further validated on two FDA-approved drugs used to treat T. gondii infection, pyrimethamine and trimethoprim, yielding relative errors of 3.35% and 2.15% in predicted pIC50 compared to experimental values. Finally, the model was applied to screen FDA-approved drugs after filtering out molecules that did not align with the characteristics of the training dataset. The predicted pIC50 values were then used to calculate ligand efficiency (LE), binding efficiency index (BEI), lipophilic ligand efficiency (LLE), and surface efficiency index (SEI), identifying the most promising TgDHFR inhibitors for further investigation. By leveraging AI and a data augmentation approach, this study provides a powerful tool for predicting the pIC50 of TgDHFR inhibitors that can be adapted to other systems. |
---|---|
ISSN: | 2949-7477 |
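
The summary names three computations that are easy to illustrate: Gaussian noise-based augmentation, the relative error used for validation, and the efficiency indices derived from predicted pIC50. The sketch below is a minimal illustration of those ideas, not the authors' implementation. All function names, the noise level `sigma`, the number of noisy copies, and the input values are assumptions made for the example. The index formulas are the commonly used textbook definitions (LE ≈ 1.37 × pIC50 / heavy-atom count, BEI = pIC50 / MW in kDa, LLE = pIC50 − logP, SEI = pIC50 / (PSA / 100 Å²)); the article may define them differently.

```python
import random


def augment_with_gaussian_noise(X, y, copies=3, sigma=0.01, seed=0):
    """Return the original rows plus `copies` noisy versions of each row.

    X: list of descriptor vectors (assumed already scaled to comparable ranges).
    y: list of pIC50 labels; each noisy copy keeps the label of its parent row.
    `copies` and `sigma` are illustrative defaults, not values from the article.
    """
    rng = random.Random(seed)
    X_aug = [list(row) for row in X]
    y_aug = list(y)
    for _ in range(copies):
        for row, label in zip(X, y):
            # Perturb every descriptor with zero-mean Gaussian noise.
            X_aug.append([value + rng.gauss(0.0, sigma) for value in row])
            y_aug.append(label)
    return X_aug, y_aug


def relative_error_percent(predicted, experimental):
    """Absolute relative error of a prediction, in percent."""
    return abs(predicted - experimental) / abs(experimental) * 100.0


def efficiency_indices(pic50, heavy_atoms, mol_weight, logp, psa):
    """Common efficiency indices computed from a (predicted) pIC50.

    mol_weight is in Da, psa (polar surface area) in square angstroms.
    """
    return {
        "LE": 1.37 * pic50 / heavy_atoms,       # kcal/mol per heavy atom
        "BEI": pic50 / (mol_weight / 1000.0),   # per kDa
        "LLE": pic50 - logp,
        "SEI": pic50 / (psa / 100.0),           # per 100 square angstroms
    }


# Made-up inputs, purely to show the call pattern.
X_demo = [[0.10, 0.50], [0.30, 0.20]]
y_demo = [6.2, 7.1]
X_aug, y_aug = augment_with_gaussian_noise(X_demo, y_demo, copies=3)
indices = efficiency_indices(pic50=7.0, heavy_atoms=20,
                             mol_weight=350.0, logp=2.5, psa=70.0)
```

With two input rows and `copies=3`, the augmented set holds eight rows: the two originals followed by six perturbed copies. In a workflow like the one described, augmentation would normally be applied to the training split only, so that validation molecules are never seen in perturbed form.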