De novo structure prediction of globular proteins aided by sequence variation-derived contacts.

The advent of high accuracy residue-residue intra-protein contact prediction methods enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm--FRAGFOLD, with PSICOV, a con...

Bibliographic Details
Main Authors: Tomasz Kosciolek, David T Jones
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0092197
_version_ 1849332390926745600
author Tomasz Kosciolek
David T Jones
author_facet Tomasz Kosciolek
David T Jones
author_sort Tomasz Kosciolek
collection DOAJ
description The advent of high-accuracy residue-residue intra-protein contact prediction methods has enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm, FRAGFOLD, with PSICOV, a contact prediction method that uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall, we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than using either statistical potentials or contacts alone. Results show up to nearly 80% correct predictions (TM-score ≥0.5) within the analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases arose either from conformational sampling problems or from insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts in determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide to the choice of correct models and to the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well suited to blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step.
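The two quantities the abstract relies on, the fraction of satisfied predicted long-range contacts and the precision/recall of a correct-fold classifier, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 8 Å distance cutoff, and the sequence-separation threshold of 24 residues are assumptions chosen for illustration; only the TM-score ≥0.5 criterion for a correct fold comes from the abstract.

```python
import math


def satisfied_long_range_fraction(coords, contacts, min_separation=24, cutoff=8.0):
    """Fraction of predicted long-range contacts satisfied by a model.

    coords: list of (x, y, z) tuples, one representative atom per residue.
    contacts: iterable of (i, j) 0-based residue index pairs (predicted contacts).
    min_separation and cutoff are illustrative assumptions, not values from the paper.
    """
    long_range = [(i, j) for i, j in contacts if abs(i - j) >= min_separation]
    if not long_range:
        return 0.0
    satisfied = sum(
        1 for i, j in long_range if math.dist(coords[i], coords[j]) <= cutoff
    )
    return satisfied / len(long_range)


def precision_recall(predicted_correct, tm_scores, threshold=0.5):
    """Precision and recall of a 'correct fold' classifier.

    predicted_correct: list of booleans (the classifier's call for each model).
    tm_scores: list of TM-scores; a model counts as correct if TM-score >= threshold.
    """
    actual = [tm >= threshold for tm in tm_scores]
    tp = sum(1 for p, a in zip(predicted_correct, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted_correct, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted_correct, actual) if not p and a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy model: 29 residues on a line, with the last residue folded back near the first.
    coords = [(float(i), 0.0, 0.0) for i in range(29)] + [(0.0, 5.0, 0.0)]
    contacts = [(0, 29), (0, 25), (0, 3)]  # the last pair is short-range and ignored
    print(satisfied_long_range_fraction(coords, contacts))
    print(precision_recall([True, True, False, True], [0.6, 0.4, 0.7, 0.55]))
```

In this sketch a model that satisfies more of its predicted long-range contacts gets a higher fraction, which is the kind of signal the abstract describes as a guide to model selection.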
format Article
id doaj-art-a28adaaa9e2840c09273da053b3be7de
institution Kabale University
issn 1932-6203
language English
publishDate 2014-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-a28adaaa9e2840c09273da053b3be7de | 2025-08-20T03:46:12Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2014-01-01 | 9 | 3 | e92197 | 10.1371/journal.pone.0092197 | De novo structure prediction of globular proteins aided by sequence variation-derived contacts. | Tomasz Kosciolek | David T Jones | https://doi.org/10.1371/journal.pone.0092197
title De novo structure prediction of globular proteins aided by sequence variation-derived contacts.
url https://doi.org/10.1371/journal.pone.0092197