Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours


Bibliographic Details
Main Authors: Takuji Yamada, Alison S Waller, Jeroen Raes, Aleksej Zelezniak, Nadia Perchat, Alain Perret, Marcel Salanoubat, Kiran R Patil, Jean Weissenbach, Peer Bork
Format: Article
Language: English
Published: Springer Nature 2012-05-01
Series: Molecular Systems Biology
Subjects: genomics; metabolic pathways; metagenomics; neighbourhood information; orphan enzymes
Online Access: https://doi.org/10.1038/msb.2012.13
author Takuji Yamada
Alison S Waller
Jeroen Raes
Aleksej Zelezniak
Nadia Perchat
Alain Perret
Marcel Salanoubat
Kiran R Patil
Jean Weissenbach
Peer Bork
collection DOAJ
description Abstract Despite the current wealth of sequencing data, one‐third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome‐scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.
format Article
id doaj-art-fa24e5949f5f4ce1828fa5e67e126118
institution Kabale University
issn 1744-4292
language English
publishDate 2012-05-01
publisher Springer Nature
record_format Article
series Molecular Systems Biology
spelling doaj-art-fa24e5949f5f4ce1828fa5e67e126118
2025-08-20T03:41:57Z
eng
Springer Nature
Molecular Systems Biology
1744-4292
2012-05-01
81112
10.1038/msb.2012.13
Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours
Takuji Yamada (Structural and Computational Biology Unit, European Molecular Biology Laboratory)
Alison S Waller (Structural and Computational Biology Unit, European Molecular Biology Laboratory)
Jeroen Raes (Molecular and Cellular Interactions Department, VIB)
Aleksej Zelezniak (Structural and Computational Biology Unit, European Molecular Biology Laboratory)
Nadia Perchat (Commissariat à l'Energie Atomique)
Alain Perret (Commissariat à l'Energie Atomique)
Marcel Salanoubat (Commissariat à l'Energie Atomique)
Kiran R Patil (Structural and Computational Biology Unit, European Molecular Biology Laboratory)
Jean Weissenbach (Commissariat à l'Energie Atomique)
Peer Bork (Structural and Computational Biology Unit, European Molecular Biology Laboratory)
https://doi.org/10.1038/msb.2012.13
genomics; metabolic pathways; metagenomics; neighbourhood information; orphan enzymes
title Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours
topic genomics
metabolic pathways
metagenomics
neighbourhood information
orphan enzymes
url https://doi.org/10.1038/msb.2012.13