Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets

Abstract Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discov...

Bibliographic Details
Main Authors: Ricard Argelaguet, Britta Velten, Damien Arnol, Sascha Dietrich, Thorsten Zenz, John C Marioni, Florian Buettner, Wolfgang Huber, Oliver Stegle
Format: Article
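Language: English
Published: Springer Nature 2018-06-01
Series: Molecular Systems Biology
Subjects: data integration, dimensionality reduction, multi‐omics, personalized medicine, single‐cell omics
Online Access: https://doi.org/10.15252/msb.20178124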
collection DOAJ
description Abstract Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi‐omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy‐chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single‐cell multi‐omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.
id doaj-art-1dc2c6e6d6284500bb61295fc5e4b496
institution DOAJ
issn 1744-4292
spelling doaj-art-1dc2c6e6d6284500bb61295fc5e4b496
record timestamp 2025-08-20T03:06:30Z
language code eng
volume 14
issue 6
pages 1-13
author affiliations
Ricard Argelaguet: European Molecular Biology Laboratory, European Bioinformatics Institute
Britta Velten: European Molecular Biology Laboratory (EMBL)
Damien Arnol: European Molecular Biology Laboratory, European Bioinformatics Institute
Sascha Dietrich: Heidelberg University Hospital
Thorsten Zenz: Heidelberg University Hospital
John C Marioni: European Molecular Biology Laboratory, European Bioinformatics Institute
Florian Buettner: European Molecular Biology Laboratory, European Bioinformatics Institute
Wolfgang Huber: European Molecular Biology Laboratory (EMBL)
Oliver Stegle: European Molecular Biology Laboratory, European Bioinformatics Institute
topic data integration
dimensionality reduction
multi‐omics
personalized medicine
single‐cell omics