A loop-counting method for covariate-corrected low-rank biclustering of gene-expression and genome-wide association study data.
A common goal in data-analysis is to sift through a large data-matrix and detect any significant submatrices (i.e., biclusters) that have a low numerical rank. We present a simple algorithm for tackling this biclustering problem. Our algorithm accumulates information about 2-by-2 submatrices (i.e.,...
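| Main Authors: | Aaditya V Rangan, Caroline C McGrouther, John Kelsoe, Nicholas Schork, Eli Stahl, Qian Zhu, Arjun Krishnan, Vicky Yao, Olga Troyanskaya, Seda Bilaloglu, Preeti Raghavan, Sarah Bergen, Anders Jureus, Mikael Landen, Bipolar Disorders Working Group of the Psychiatric Genomics Consortium |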
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2018-05-01 |
| Series: | PLoS Computational Biology |
| Online Access: | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006105&type=printable |
| _version_ | 1850230186638835712 |
|---|---|
| author | Aaditya V Rangan; Caroline C McGrouther; John Kelsoe; Nicholas Schork; Eli Stahl; Qian Zhu; Arjun Krishnan; Vicky Yao; Olga Troyanskaya; Seda Bilaloglu; Preeti Raghavan; Sarah Bergen; Anders Jureus; Mikael Landen; Bipolar Disorders Working Group of the Psychiatric Genomics Consortium |
| author_sort | Aaditya V Rangan |
| collection | DOAJ |
| description | A common goal in data-analysis is to sift through a large data-matrix and detect any significant submatrices (i.e., biclusters) that have a low numerical rank. We present a simple algorithm for tackling this biclustering problem. Our algorithm accumulates information about 2-by-2 submatrices (i.e., 'loops') within the data-matrix, and focuses on rows and columns of the data-matrix that participate in an abundance of low-rank loops. We demonstrate, through analysis and numerical-experiments, that this loop-counting method performs well in a variety of scenarios, outperforming simple spectral methods in many situations of interest. Another important feature of our method is that it can easily be modified to account for aspects of experimental design which commonly arise in practice. For example, our algorithm can be modified to correct for controls, categorical- and continuous-covariates, as well as sparsity within the data. We demonstrate these practical features with two examples; the first drawn from gene-expression analysis and the second drawn from a much larger genome-wide-association-study (GWAS). |
| format | Article |
| id | doaj-art-3516fd8d65ac40969c8cda7731788d4e |
| institution | OA Journals |
| issn | 1553-734X 1553-7358 |
| language | English |
| publishDate | 2018-05-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS Computational Biology |
| spelling | doaj-art-3516fd8d65ac40969c8cda7731788d4e; 2025-08-20T02:03:57Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; ISSN 1553-734X, 1553-7358; 2018-05-01; vol. 14, no. 5, e1006105; doi:10.1371/journal.pcbi.1006105; https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006105&type=printable |
| title | A loop-counting method for covariate-corrected low-rank biclustering of gene-expression and genome-wide association study data. |
| url | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006105&type=printable |
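
The description above outlines the method only in words: accumulate information about 2-by-2 submatrices ('loops') and focus on rows and columns that participate in many low-rank loops. The following is a minimal Python sketch of one plausible reading of that idea, not the authors' implementation. Several details are assumptions made here for illustration: the data are binarized to ±1 about the global median, a loop counts as low-rank when the product of its four signs is +1, and the lowest-scoring rows and columns are discarded a few at a time until a requested size is reached. The names `loop_scores`, `loop_count_bicluster`, and the parameter `frac` are hypothetical. Covariate, control, and sparsity corrections mentioned in the description are not covered.

```python
import numpy as np


def loop_scores(B):
    """Signed loop counts for each row and column of a +/-1 matrix B.

    The score of row j is the sum of B[j,k]*B[j,k2]*B[j2,k2]*B[j2,k] over all
    j2 != j and all ordered pairs k != k2. Because every entry is +/-1, this
    equals sum_{j2 != j} (G[j,j2]**2 - n) with G = B @ B.T, so no explicit
    loop enumeration is needed. Column scores are defined symmetrically.
    """
    m, n = B.shape
    g_rows = B @ B.T                      # m x m row inner products
    g_cols = B.T @ B                      # n x n column inner products
    row = (g_rows ** 2).sum(axis=1) - n * n - (m - 1) * n
    col = (g_cols ** 2).sum(axis=1) - m * m - (n - 1) * m
    return row, col


def loop_count_bicluster(D, n_rows, n_cols, frac=0.05):
    """Shrink D to an n_rows x n_cols submatrix by dropping low-score lines.

    Assumption: binarize about the global median, then repeatedly remove
    roughly `frac` of the remaining rows and columns with the lowest scores.
    Returns the indices of the retained rows and columns.
    """
    B = np.where(D > np.median(D), 1, -1)
    rows = np.arange(D.shape[0])
    cols = np.arange(D.shape[1])
    while len(rows) > n_rows or len(cols) > n_cols:
        r, c = loop_scores(B[np.ix_(rows, cols)])
        if len(rows) > n_rows:
            k = max(1, min(int(frac * len(rows)), len(rows) - n_rows))
            rows = rows[np.sort(np.argsort(r)[k:])]   # drop the k lowest rows
        if len(cols) > n_cols:
            k = max(1, min(int(frac * len(cols)), len(cols) - n_cols))
            cols = cols[np.sort(np.argsort(c)[k:])]   # drop the k lowest columns
    return rows, cols
```

Computing the scores through the Gram matrices keeps each pass at the cost of two matrix products rather than an enumeration of every 2-by-2 submatrix, which is what makes the approach practical on a large data-matrix in this sketch.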