GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity

Abstract Background Drug-target binding affinity (DTA) prediction is vital in drug discovery and repositioning, and it is attracting growing research attention. Many effective methods have been proposed. However, some current methods have certain shortcomings in focusing on important nodes...

Bibliographic Details
Main Authors: Junwei Luo, Ziguang Zhu, Zhenhan Xu, Chuanle Xiao, Jingjing Wei, Jiquan Shen
Format: Article
Language: English
Published: BMC 2025-02-01
Series: BMC Genomics
Subjects: Drug-target binding affinity; Graph neural networks; Transformer
Online Access: https://doi.org/10.1186/s12864-025-11234-4
_version_ 1825197683327893504
author Junwei Luo
Ziguang Zhu
Zhenhan Xu
Chuanle Xiao
Jingjing Wei
Jiquan Shen
author_sort Junwei Luo
collection DOAJ
description Abstract Background Drug-target binding affinity (DTA) prediction is vital in drug discovery and repositioning, and it is attracting growing research attention. Many effective methods have been proposed. However, some current methods fall short in focusing on important nodes in drug molecular graphs and in handling structurally complex molecules. In particular, when considering important nodes and complex substructures in molecules, they may not fully explore the potential relationships between different parts. In addition, when dealing with protein structures, some methods ignore the connections between amino acid fragments that are far apart in sequence but may act synergistically in function. Results In this paper, we propose a new method, called GS-DTA, for predicting DTA based on graph and sequence models. GS-DTA takes the simplified molecular-input line-entry system (SMILES) string of the drug and the amino acid sequence of the protein as input. First, each drug is modeled as a graph, in which a vertex is an atom and an edge represents an interaction between atoms. Then GATv2-GCN and three-layer GCN networks are used to extract the features of the drug. GATv2-GCN enhances the model's ability to focus on important nodes by assigning dynamic attention scores, which improves the learning of intricate patterns in the graph structure. In addition, the three-layer GCN captures hierarchical features of the drug through deeper propagation and feature transformation. Meanwhile, for each protein, a framework combining CNN, Bi-LSTM, and Transformer is used to extract the contextual and structural information of the amino acid sequence; this combination helps capture comprehensive and detailed features of the protein. Finally, the obtained drug and protein feature vectors are combined to predict DTA through a fully connected layer. The source code can be downloaded from https://github.com/zhuziguang/GS-DTA .
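The graph-convolution step that the abstract refers to can be illustrated with a minimal sketch. It uses the standard GCN propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python; it is not taken from the GS-DTA repository. The toy molecule, the one-hot atom features, the identity weight matrix, and the mean-pooling readout are all assumptions chosen for illustration, and the GATv2 attention branch is omitted.

```python
import math


def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W)."""
    n = len(adj)
    n_in, n_out = len(weight), len(weight[0])
    # Add self-loops so each atom keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


def mean_pool(node_feats):
    """Average node features into one graph-level vector (assumed readout)."""
    n = len(node_feats)
    return [sum(col) / n for col in zip(*node_feats)]


# Hypothetical toy molecule: heavy atoms of ethanol (SMILES "CCO"), a C-C-O chain.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
# One-hot atom types: [is_carbon, is_oxygen].
atom_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Identity weights keep the arithmetic easy to follow; real weights are learned.
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = atom_feats
for _ in range(3):  # three stacked layers widen each atom's receptive field
    hidden = gcn_layer(adjacency, hidden, identity)
drug_vector = mean_pool(hidden)
```

After a single layer the first carbon carries no oxygen signal, because the oxygen is two bonds away; after stacking layers it does, which is the sense in which deeper propagation captures hierarchical structure.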
Conclusions The results show that GS-DTA achieves good performance in terms of MSE, CI, and r_m^2 on the Davis and KIBA datasets, improving the accuracy of DTA prediction.
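The three evaluation metrics named in the conclusions can be sketched as follows. The definitions are the ones commonly used in DTA benchmarking (pairwise concordance index with prediction ties scored 0.5, and r_m^2 = r^2 (1 - sqrt(|r^2 - r_0^2|)) with r_0^2 from a regression through the origin). Whether GS-DTA uses exactly these formulations is an assumption, and the affinity values below are made up for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Share of pairs with different true affinities that are ranked correctly.

    A tie in the predictions counts as half a correct pair.
    """
    score, pairs = 0.0, 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                pairs += 1
                if y_pred[i] > y_pred[j]:
                    score += 1.0
                elif y_pred[i] == y_pred[j]:
                    score += 0.5
    return score / pairs if pairs else 0.0


def rm2(y_true, y_pred):
    """Modified squared correlation; assumes neither input is constant."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov * cov / (var_t * var_p)  # squared Pearson correlation
    # Slope of the least-squares line through the origin.
    k = sum(t * p for t, p in zip(y_true, y_pred)) / sum(p * p for p in y_pred)
    r02 = 1.0 - sum((t - k * p) ** 2 for t, p in zip(y_true, y_pred)) / var_t
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


# Made-up affinities (e.g. pKd-like values), for illustration only.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 5.9, 7.2, 7.8]

scores = {
    "MSE": mse(measured, predicted),
    "CI": concordance_index(measured, predicted),
    "rm2": rm2(measured, predicted),
}
```

Lower MSE is better, while CI and r_m^2 are better when closer to 1; a CI of 0.5 corresponds to random ranking.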
format Article
id doaj-art-db874f17d16446e0a3e7335efa37e14e
institution Kabale University
issn 1471-2164
language English
publishDate 2025-02-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj-art-db874f17d16446e0a3e7335efa37e14e; 2025-02-09T12:14:01Z; eng; BMC; BMC Genomics; 1471-2164; 2025-02-01; 26(1):1-10; 10.1186/s12864-025-11234-4
GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity
Junwei Luo (School of Software, Henan Polytechnic University)
Ziguang Zhu (School of Software, Henan Polytechnic University)
Zhenhan Xu (School of Software, Henan Polytechnic University)
Chuanle Xiao (State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University)
Jingjing Wei (College of Chemical and Environmental Engineering, Anyang Institute of Technology)
Jiquan Shen (School of Software, Henan Polytechnic University)
https://doi.org/10.1186/s12864-025-11234-4
Drug-target binding affinity
Graph neural networks
Transformer
title GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity
topic Drug-target binding affinity
Graph neural networks
Transformer
url https://doi.org/10.1186/s12864-025-11234-4