CREATE: cell-type-specific cis-regulatory element identification via discrete embedding

Bibliographic Details
Main Authors: Xuejian Cui, Qijin Yin, Zijing Gao, Zhen Li, Xiaoyang Chen, Hairong Lv, Shengquan Chen, Qiao Liu, Wanwen Zeng, Rui Jiang
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-59780-5
collection DOAJ
description Abstract Cis-regulatory elements (CREs), including enhancers, silencers, promoters and insulators, play pivotal roles in orchestrating gene regulatory mechanisms that drive complex biological traits. However, current approaches for CRE identification are predominantly sequence-based and typically focus on individual CRE types, limiting insights into their cell-type-specific functions and regulatory dynamics. Here, we present CREATE, a multimodal deep learning framework based on Vector Quantized Variational AutoEncoder, tailored for comprehensive CRE identification and characterization. CREATE integrates genomic sequences, chromatin accessibility, and chromatin interaction data to generate discrete CRE embeddings, enabling accurate multi-class classification and robust characterization of CREs. CREATE excels in identifying cell-type-specific CREs, and provides quantitative and interpretable insights into CRE-specific features, uncovering the underlying regulatory codes. By facilitating large-scale prediction of CREs in specific cell types, CREATE enhances the recognition of disease- or phenotype-associated biological variabilities of CREs, thus advancing our understanding of gene regulatory landscapes and their roles in health and disease.
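The abstract names a Vector Quantized Variational AutoEncoder as the basis for the discrete CRE embeddings. As a rough illustration of what "discrete embedding" means in that family of models, the sketch below shows only the generic quantization step: each continuous latent vector is replaced by its nearest entry in a codebook, and the index of that entry serves as the discrete code. This is not the CREATE implementation; the function names, the toy codebook, and the toy latent vectors are all hypothetical, and the multimodal encoder, decoder, and training losses are omitted.

```python
# Illustrative sketch of generic vector quantization; NOT the CREATE code.
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous latent vector to its nearest codebook entry.

    Returns the chosen codebook indices (the discrete codes) and the
    corresponding code vectors (the quantized embeddings).
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Pick the codebook row with the smallest distance to z.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy codebook with three 2-D code vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Hypothetical encoder outputs for three input regions.
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(latents, codebook)
print(indices)  # [0, 1, 2]
```

In a trained model of this kind the codebook is learned jointly with the encoder and decoder; here it is fixed by hand so that the lookup itself is easy to follow.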
id doaj-art-e4081574c4104559b998e2c6ce21676d
institution Kabale University
issn 2041-1723
spelling doaj-art-e4081574c4104559b998e2c6ce21676d
Record timestamp: 2025-08-20T03:48:02Z
Language code: eng
Publisher: Nature Portfolio
Journal: Nature Communications
ISSN: 2041-1723
Publication date: 2025-05-01
Volume 16, issue 1, pages 1-18
DOI: 10.1038/s41467-025-59780-5
Title: CREATE: cell-type-specific cis-regulatory element identification via discrete embedding
Authors and affiliations:
Xuejian Cui, Qijin Yin, Zijing Gao, Zhen Li, Xiaoyang Chen, Hairong Lv, Rui Jiang: Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University
Shengquan Chen: School of Mathematical Sciences and LPMC, Nankai University
Qiao Liu, Wanwen Zeng: Department of Statistics, Stanford University