Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression.

Bibliographic Details
Main Authors: Sonja Lehtinen, Jon Lees, Jürg Bähler, John Shawe-Taylor, Christine Orengo
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0134668&type=printable
collection DOAJ
description With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from this data are becoming increasingly important. Data relating to functional association between genes or proteins, such as co-expression or functional association, is often represented in terms of gene or protein networks. Several methods of predicting gene function from these networks have been proposed. However, evaluating the relative performance of these algorithms may not be trivial: concerns have been raised over biases in different benchmarking methods and datasets, particularly relating to non-independence of functional association data and test data. In this paper we propose a new network-based gene function prediction algorithm using a commute-time kernel and partial least squares regression (Compass). We compare Compass to GeneMANIA, a leading network-based prediction algorithm, using a number of different benchmarks, and find that Compass outperforms GeneMANIA on these benchmarks. We also explicitly explore problems associated with the non-independence of functional association data and test data. We find that a benchmark based on the Gene Ontology database, which, directly or indirectly, incorporates information from other databases, may considerably overestimate the performance of algorithms exploiting functional association data for prediction.
id doaj-art-d54f6918fbcf4262b9bcdf5f08b1a384
institution OA Journals
issn 1932-6203
spelling doaj-art-d54f6918fbcf4262b9bcdf5f08b1a384 | 2025-08-20T02:22:46Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2015-01-01 | vol. 10, no. 8, e0134668 | doi:10.1371/journal.pone.0134668