Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

Bibliographic Details
Main Authors: Huiying Zhao, Jihua Wang, Yaoqi Zhou, Yuedong Yang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0096694&type=printable
author Huiying Zhao
Jihua Wang
Yaoqi Zhou
Yuedong Yang
collection DOAJ
description As more and more protein sequences are uncovered by increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than the low-resolution two-state prediction of DNA binding that most existing techniques provide. The method first predicts the protein-DNA complex structure by using the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross-validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to the human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in apo forms. The method [SPOT-Seq (DNA)] is available as an online server at http://sparks-lab.org.
format Article
id doaj-art-3f00eae6a8b14e32b34c1d47d58e0aea
institution DOAJ
issn 1932-6203
language English
publishDate 2014-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-3f00eae6a8b14e32b34c1d47d58e0aea; 2025-08-20T03:01:22Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; volume 9; issue 5; e96694; doi:10.1371/journal.pone.0096694
title Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.
url https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0096694&type=printable
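
The description reports an MCC of 0.77 together with 94% precision and 65% sensitivity on 179 DNA-binding and 3797 non-binding domains. As a rough consistency check, the sketch below backs out an approximate confusion matrix from those rounded figures and recomputes the MCC. The integer counts are an assumption derived from the rounded percentages, not values taken from the article, and the function names are illustrative only.

```python
import math


def confusion_from_rates(n_pos, n_neg, sensitivity, precision):
    """Estimate (tp, fp, tn, fn) from rounded rates; counts are approximate."""
    tp = round(sensitivity * n_pos)               # sensitivity = tp / (tp + fn)
    fn = n_pos - tp
    fp = round(tp * (1 - precision) / precision)  # precision = tp / (tp + fp)
    tn = n_neg - fp
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Figures quoted in the description: 179 binding and 3797 non-binding domains,
# 65% sensitivity, 94% precision. The derived counts are an assumption.
tp, fp, tn, fn = confusion_from_rates(179, 3797, 0.65, 0.94)
print("tp, fp, tn, fn =", tp, fp, tn, fn)
print("MCC =", round(mcc(tp, fp, tn, fn), 2))
```

Under these assumed counts the recomputed MCC rounds to 0.77, in line with the reported value. The check also shows why MCC is lower than the precision alone would suggest: with a negative set roughly 21 times larger than the positive set, the missed binders (false negatives) weigh on the coefficient even though false positives are rare.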