iProtDNA-SMOTE: Enhancing protein-DNA binding sites prediction through imbalanced graph neural networks.
Protein-DNA interactions play a crucial role in cellular biology, essential for maintaining life processes and regulating cellular functions. We propose a method called iProtDNA-SMOTE, which utilizes non-equilibrium graph neural networks along with pre-trained protein language models to predict DNA...
| Main Authors: | Ruiyan Huang, Wangren Qiu, Xuan Xiao, Weizhong Lin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0320817 |
| _version_ | 1849732815866822656 |
|---|---|
| author | Ruiyan Huang, Wangren Qiu, Xuan Xiao, Weizhong Lin |
| author_sort | Ruiyan Huang |
| collection | DOAJ |
| description | Protein-DNA interactions play a crucial role in cellular biology, essential for maintaining life processes and regulating cellular functions. We propose a method called iProtDNA-SMOTE, which utilizes non-equilibrium graph neural networks along with pre-trained protein language models to predict DNA binding residues. This approach effectively addresses the class imbalance issue in predicting protein-DNA binding sites by leveraging unbalanced graph data, thus enhancing the model's generalization and specificity. We trained the model on two datasets, TR646 and TR573, and conducted a series of experiments to evaluate its performance. The model achieved AUC values of 0.850, 0.896, and 0.858 on the independent test datasets TE46, TE129, and TE181, respectively. These results indicate that iProtDNA-SMOTE outperforms existing methods in terms of accuracy and generalization for predicting DNA binding sites, offering reliable and effective predictions to minimize errors. The model has been thoroughly validated for its ability to predict protein-DNA binding sites with high reliability and precision. For the convenience of the scientific community, the benchmark datasets and code are publicly available at https://github.com/primrosehry/iProtDNA-SMOTE. |
| format | Article |
| id | doaj-art-2ce4ae37841d4c39aef2fd9ba5acc08f |
| institution | DOAJ |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| title | iProtDNA-SMOTE: Enhancing protein-DNA binding sites prediction through imbalanced graph neural networks. |
| url | https://doi.org/10.1371/journal.pone.0320817 |