GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity
Abstract. Background: Drug-target binding affinity (DTA) prediction is vital in drug discovery and repositioning, and a growing number of researchers are working on it. Many effective methods have been proposed. However, some current methods have certain shortcomings in focusing on important nodes...
Main Authors: | Junwei Luo, Ziguang Zhu, Zhenhan Xu, Chuanle Xiao, Jingjing Wei, Jiquan Shen |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2025-02-01 |
Series: | BMC Genomics |
Subjects: | Drug-target binding affinity; Graph neural networks; Transformer |
Online Access: | https://doi.org/10.1186/s12864-025-11234-4 |
_version_ | 1825197683327893504 |
---|---|
author | Junwei Luo Ziguang Zhu Zhenhan Xu Chuanle Xiao Jingjing Wei Jiquan Shen |
author_sort | Junwei Luo |
collection | DOAJ |
description | Abstract. Background: Drug-target binding affinity (DTA) prediction is vital in drug discovery and repositioning, and a growing number of researchers are working on it. Many effective methods have been proposed. However, some current methods fall short in focusing on important nodes in drug molecular graphs and in handling structurally complex molecules. In particular, when considering important nodes and complex substructures in molecules, they may not fully explore the potential relationships between different parts. In addition, when dealing with protein structures, some methods ignore the connections between amino acid fragments that are far apart in sequence but may act synergistically in function. Results: In this paper, we propose a new method, called GS-DTA, for predicting DTA based on graph and sequence models. GS-DTA takes the simplified molecular-input line-entry system (SMILES) string of the drug and the amino acid sequence of the protein as input. First, each drug is modeled as a graph in which each vertex is an atom and each edge represents an interaction between atoms. Then a GATv2-GCN network and a three-layer GCN network are used to extract the features of the drug. GATv2-GCN enhances the model’s ability to focus on important nodes by assigning dynamic attention scores, which improves the learning of intricate patterns in the graph structure. In addition, the three-layer GCN captures hierarchical features of the drug through deeper propagation and feature transformation. Meanwhile, for each protein, a framework combining CNN, Bi-LSTM, and Transformer is used to extract the contextual and structural information of the amino acid sequence, yielding a comprehensive and detailed representation of the protein. Finally, the drug and protein feature vectors are combined, and a fully connected layer predicts the DTA. The source code can be downloaded from https://github.com/zhuziguang/GS-DTA .
Conclusions: The results show that GS-DTA achieves good performance in terms of MSE, CI, and r_m^2 on the Davis and KIBA datasets, improving the accuracy of DTA prediction. |
format | Article |
id | doaj-art-db874f17d16446e0a3e7335efa37e14e |
institution | Kabale University |
issn | 1471-2164 |
language | English |
publishDate | 2025-02-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj-art-db874f17d16446e0a3e7335efa37e14e; 2025-02-09T12:14:01Z; eng; BMC; BMC Genomics; 1471-2164; 2025-02-01; 261110; 10.1186/s12864-025-11234-4; GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity; Junwei Luo (0), Ziguang Zhu (1), Zhenhan Xu (2), Chuanle Xiao (3), Jingjing Wei (4), Jiquan Shen (5); Affiliations: (0) School of Software, Henan Polytechnic University; (1) School of Software, Henan Polytechnic University; (2) School of Software, Henan Polytechnic University; (3) State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University; (4) College of Chemical and Environmental Engineering, Anyang Institute of Technology; (5) School of Software, Henan Polytechnic University; https://doi.org/10.1186/s12864-025-11234-4; Drug-target binding affinity; Graph neural networks; Transformer |
title | GS-DTA: integrating graph and sequence models for predicting drug-target binding affinity |
title_sort | gs dta integrating graph and sequence models for predicting drug target binding affinity |
topic | Drug-target binding affinity Graph neural networks Transformer |
url | https://doi.org/10.1186/s12864-025-11234-4 |
work_keys_str_mv | AT junweiluo gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity AT ziguangzhu gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity AT zhenhanxu gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity AT chuanlexiao gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity AT jingjingwei gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity AT jiquanshen gsdtaintegratinggraphandsequencemodelsforpredictingdrugtargetbindingaffinity |
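
The abstract in this record reports results in terms of MSE and CI. As a minimal sketch of how these two metrics are commonly computed for affinity regression, the Python below works on plain lists. The function names `mean_squared_error` and `concordance_index` and the sample affinity values are illustrative assumptions, not taken from the GS-DTA source code, and the r_m^2 metric is not covered here.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal measured affinity are skipped as not comparable;
    pairs with tied predictions contribute 0.5.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # no defined true ordering for this pair
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (t_i > t_j) == (p_i > p_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs in y_true")
    return concordant / comparable


if __name__ == "__main__":
    # Made-up affinity values, for illustration only.
    measured = [5.0, 6.0, 7.0, 8.0]
    predicted = [5.5, 5.9, 7.2, 7.1]
    print("MSE:", mean_squared_error(measured, predicted))
    print("CI:", concordance_index(measured, predicted))
```

With the made-up values above, five of the six comparable pairs are ranked in the right order, so the CI is 5/6 and the MSE is about 0.2775. Lower MSE and higher CI indicate better predictions. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but slow on large evaluation sets.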