Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets
Abstract: Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discov...
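| Main Authors: | Ricard Argelaguet, Britta Velten, Damien Arnol, Sascha Dietrich, Thorsten Zenz, John C Marioni, Florian Buettner, Wolfgang Huber, Oliver Stegle |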
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer Nature, 2018-06-01 |
| Series: | Molecular Systems Biology |
| Subjects: | data integration; dimensionality reduction; multi‐omics; personalized medicine; single‐cell omics |
| Online Access: | https://doi.org/10.15252/msb.20178124 |
| _version_ | 1849738620731129856 |
|---|---|
| author | Ricard Argelaguet; Britta Velten; Damien Arnol; Sascha Dietrich; Thorsten Zenz; John C Marioni; Florian Buettner; Wolfgang Huber; Oliver Stegle |
| author_sort | Ricard Argelaguet |
| collection | DOAJ |
| description | Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi‐omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy‐chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single‐cell multi‐omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation. |
| format | Article |
| id | doaj-art-1dc2c6e6d6284500bb61295fc5e4b496 |
| institution | DOAJ |
| issn | 1744-4292 |
| language | English |
| publishDate | 2018-06-01 |
| publisher | Springer Nature |
| record_format | Article |
| series | Molecular Systems Biology |
| spelling | doaj-art-1dc2c6e6d6284500bb61295fc5e4b496; 2025-08-20T03:06:30Z; eng; Springer Nature; Molecular Systems Biology; ISSN 1744-4292; 2018-06-01; vol. 14, no. 6, pp. 1-13; DOI 10.15252/msb.20178124. Author affiliations: Ricard Argelaguet (European Molecular Biology Laboratory, European Bioinformatics Institute); Britta Velten (European Molecular Biology Laboratory (EMBL)); Damien Arnol (European Molecular Biology Laboratory, European Bioinformatics Institute); Sascha Dietrich (Heidelberg University Hospital); Thorsten Zenz (Heidelberg University Hospital); John C Marioni (European Molecular Biology Laboratory, European Bioinformatics Institute); Florian Buettner (European Molecular Biology Laboratory, European Bioinformatics Institute); Wolfgang Huber (European Molecular Biology Laboratory (EMBL)); Oliver Stegle (European Molecular Biology Laboratory, European Bioinformatics Institute) |
| title | Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets |
| topic | data integration; dimensionality reduction; multi‐omics; personalized medicine; single‐cell omics |
| url | https://doi.org/10.15252/msb.20178124 |