iVAE: an interpretable representation learning framework enhances clustering performance for single-cell data

Bibliographic Details
Main Authors: Zeyu Fu, Chunlin Chen, Song Wang, Junping Wang, Shilei Chen
Format: Article
Language: English
Published: BMC 2025-07-01
Series: BMC Biology
Online Access: https://doi.org/10.1186/s12915-025-02315-7
Description
Summary: Background: Variational autoencoders (VAEs) serve as essential components in large generative models for extracting latent representations and are widely applied in biological domains. Developing VAEs tailored to the characteristics of biological data is crucial for advancing future large-scale biological models. Results: By systematically monitoring VAE training across 31 public single-cell datasets spanning oncological and normal conditions, we found that reducing the β value, which corresponds to lower disentanglement of the VAE, significantly improves unsupervised clustering metrics in single-cell data analysis. Based on this finding, we developed iVAE with an irecon module; benchmarked against 8 established dimensionality reduction methods across 5 clustering performance metrics, it exhibited superior capabilities in representing single-cell transcriptomic data. Conclusions: As measured by clustering metrics, the proposed iVAE architecture enhances the interpretability of single-cell data compared to conventional VAE architectures. Our work establishes a potential foundational VAE architecture for developing specialized large-scale generative models for biological applications.
ISSN: 1741-7007