GramSeq-DTA: A Grammar-Based Drug–Target Affinity Prediction Approach Fusing Gene Expression Information

Bibliographic Details
Main Authors: Kusal Debnath, Pratip Rana, Preetam Ghosh
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Biomolecules
Subjects: drug–target affinity; deep learning; grammar-based encoding; chemical perturbation; multi-modal
Online Access: https://www.mdpi.com/2218-273X/15/3/405
_version_ 1850203976632369152
author Kusal Debnath
Pratip Rana
Preetam Ghosh
collection DOAJ
description Drug–target affinity (DTA) prediction is a critical aspect of drug discovery, and accurate prediction depends on a meaningful representation of drugs and targets. Using 1D string-based representations for drugs and targets is a common approach that has demonstrated good results in DTA prediction. However, this approach lacks information on the relative positions of atoms and bonds. To address this limitation, graph-based representations have been used to some extent. However, considering only the structural aspect of drugs and targets may be insufficient for accurate DTA prediction. Integrating the functional aspect of these drugs at the genetic level can enhance the prediction capability of the models. To fill this gap, we propose GramSeq-DTA, which integrates chemical perturbation information with the structural information of drugs and targets. We applied a Grammar Variational Autoencoder (GVAE) for drug feature extraction and used two different approaches for protein feature extraction: a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The chemical perturbation data are obtained from the L1000 project, which provides information on the up-regulation and down-regulation of genes caused by selected drugs. This chemical perturbation information is processed into a compact dataset that serves as the functional feature set of the drugs. By integrating the drug, gene, and target features in the model, our approach outperforms the current state-of-the-art DTA prediction models when validated on widely used DTA datasets (BindingDB, Davis, and KIBA). This work provides a novel and practical approach to DTA prediction by merging the structural and functional aspects of biological entities, and it encourages further research in multi-modal DTA prediction.
format Article
id doaj-art-69be715cf3a341f6a294b71bb07d161e
institution OA Journals
issn 2218-273X
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Biomolecules
spelling doaj-art-69be715cf3a341f6a294b71bb07d161e
2025-08-20T02:11:22Z
eng
MDPI AG
Biomolecules
2218-273X
2025-03-01
Volume 15, Issue 3, Article 405
10.3390/biom15030405
GramSeq-DTA: A Grammar-Based Drug–Target Affinity Prediction Approach Fusing Gene Expression Information
Kusal Debnath (Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA)
Pratip Rana (Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA)
Preetam Ghosh (Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA)
https://www.mdpi.com/2218-273X/15/3/405
title GramSeq-DTA: A Grammar-Based Drug–Target Affinity Prediction Approach Fusing Gene Expression Information
topic drug–target affinity
deep learning
grammar-based encoding
chemical perturbation
multi-modal
url https://www.mdpi.com/2218-273X/15/3/405
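
The description field above says that drug, gene, and target features are integrated in one model but does not state how they are combined. The following is a minimal sketch of one possible reading, assuming simple concatenation followed by a small regression head. Every name here (`fuse_features`, `AffinityHead`), the vector sizes, the +1/-1/0 coding of gene regulation, and the untrained random weights are hypothetical illustrations and are not taken from the article.

```python
import random
from typing import List

Vector = List[float]


def fuse_features(drug: Vector, gene: Vector, target: Vector) -> Vector:
    """Concatenate the three modality vectors (assumed fusion rule)."""
    return list(drug) + list(gene) + list(target)


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(v: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Dense layer: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


class AffinityHead:
    """Hypothetical two-layer regression head with untrained random weights."""

    def __init__(self, in_dim: int, hidden_dim: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)  # seeded so the sketch is reproducible
        self.in_dim = in_dim
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]]
        self.b2 = [0.0]

    def predict(self, fused: Vector) -> float:
        """Map a fused feature vector to a single affinity score."""
        if len(fused) != self.in_dim:
            raise ValueError(f"expected {self.in_dim} features, got {len(fused)}")
        hidden = relu(linear(fused, self.w1, self.b1))
        return linear(hidden, self.w2, self.b2)[0]


# Placeholder vectors standing in for real encoder outputs (all assumed).
drug_latent = [0.2, -0.1, 0.4]          # stands in for a GVAE latent code
gene_signature = [1.0, -1.0, 0.0, 1.0]  # assumed coding: up +1, down -1, unchanged 0
target_embedding = [0.3, 0.7]           # stands in for a CNN or RNN protein encoding

fused = fuse_features(drug_latent, gene_signature, target_embedding)
head = AffinityHead(in_dim=len(fused))
score = head.predict(fused)
```

Because the weights are random, `score` carries no biological meaning; the sketch only shows how three separately encoded modalities could feed a single affinity regressor. In a trained model, the placeholder vectors would be replaced by the outputs of the drug, gene, and protein encoders.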