MIDAA: deep archetypal analysis for interpretable multi-omic data integration based on biological principles

Abstract High-throughput multi-omic molecular profiling allows the probing of biological systems at unprecedented resolution. However, integrating and interpreting high-dimensional, sparse, and noisy multimodal datasets remains challenging. Deriving new biological insights with current methods is di...

Bibliographic Details
Main Authors: Salvatore Milite, Giulio Caravagna, Andrea Sottoriva
Format: Article
Language: English
Published: BMC 2025-04-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-025-03530-9
collection DOAJ
description Abstract High-throughput multi-omic molecular profiling allows the probing of biological systems at unprecedented resolution. However, integrating and interpreting high-dimensional, sparse, and noisy multimodal datasets remains challenging. Deriving new biological insights with current methods is difficult because they are not rooted in biological principles but prioritise tasks like dimensionality reduction. Here, we introduce a framework that combines archetypal analysis, an approach grounded in biological principles, with deep learning. Using archetypes based on evolutionary trade-offs and Pareto optimality, MIDAA finds extreme data points that define the geometry of the latent space, preserving the complexity of biological interactions while retaining an interpretable output. We demonstrate that these extreme points represent cellular programmes reflecting the underlying biology. Moreover, we show that, compared to alternative methods, MIDAA can identify parsimonious, interpretable, and biologically relevant patterns from real and simulated multi-omics.
id doaj-art-da2c9afe438345898dd696ca7f53b009
institution OA Journals
issn 1474-760X
affiliations Salvatore Milite: Computational Biology Research Centre, Human Technopole
Giulio Caravagna: Department of Mathematics, Informatics and Geosciences, University of Trieste
Andrea Sottoriva: Computational Biology Research Centre, Human Technopole
citation Genome Biology, volume 26, issue 1, pages 1-16 (2025-04-01)
doi 10.1186/s13059-025-03530-9
record_updated 2025-08-20T02:17:13Z