CREATE: cell-type-specific cis-regulatory element identification via discrete embedding
Abstract: Cis-regulatory elements (CREs), including enhancers, silencers, promoters and insulators, play pivotal roles in orchestrating gene regulatory mechanisms that drive complex biological traits. However, current approaches for CRE identification are predominantly sequence-based and typically fo...
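| Main Authors: | Xuejian Cui, Qijin Yin, Zijing Gao, Zhen Li, Xiaoyang Chen, Hairong Lv, Shengquan Chen, Qiao Liu, Wanwen Zeng, Rui Jiang |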
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-59780-5 |
| _version_ | 1849326890287890432 |
|---|---|
| author | Xuejian Cui, Qijin Yin, Zijing Gao, Zhen Li, Xiaoyang Chen, Hairong Lv, Shengquan Chen, Qiao Liu, Wanwen Zeng, Rui Jiang |
| author_sort | Xuejian Cui |
| collection | DOAJ |
| description | Abstract Cis-regulatory elements (CREs), including enhancers, silencers, promoters and insulators, play pivotal roles in orchestrating gene regulatory mechanisms that drive complex biological traits. However, current approaches for CRE identification are predominantly sequence-based and typically focus on individual CRE types, limiting insights into their cell-type-specific functions and regulatory dynamics. Here, we present CREATE, a multimodal deep learning framework based on Vector Quantized Variational AutoEncoder, tailored for comprehensive CRE identification and characterization. CREATE integrates genomic sequences, chromatin accessibility, and chromatin interaction data to generate discrete CRE embeddings, enabling accurate multi-class classification and robust characterization of CREs. CREATE excels in identifying cell-type-specific CREs, and provides quantitative and interpretable insights into CRE-specific features, uncovering the underlying regulatory codes. By facilitating large-scale prediction of CREs in specific cell types, CREATE enhances the recognition of disease- or phenotype-associated biological variabilities of CREs, thus advancing our understanding of gene regulatory landscapes and their roles in health and disease. |
| format | Article |
| id | doaj-art-e4081574c4104559b998e2c6ce21676d |
| institution | Kabale University |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-e4081574c4104559b998e2c6ce21676d; 2025-08-20T03:48:02Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-05-01; 16; 1; 1; 18; 10.1038/s41467-025-59780-5; CREATE: cell-type-specific cis-regulatory element identification via discrete embedding; Authors: Xuejian Cui (0), Qijin Yin (1), Zijing Gao (2), Zhen Li (3), Xiaoyang Chen (4), Hairong Lv (5), Shengquan Chen (6), Qiao Liu (7), Wanwen Zeng (8), Rui Jiang (9); Affiliations: (0-5, 9) Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University; (6) School of Mathematical Sciences and LPMC, Nankai University; (7-8) Department of Statistics, Stanford University; Abstract Cis-regulatory elements (CREs), including enhancers, silencers, promoters and insulators, play pivotal roles in orchestrating gene regulatory mechanisms that drive complex biological traits. However, current approaches for CRE identification are predominantly sequence-based and typically focus on individual CRE types, limiting insights into their cell-type-specific functions and regulatory dynamics. Here, we present CREATE, a multimodal deep learning framework based on Vector Quantized Variational AutoEncoder, tailored for comprehensive CRE identification and characterization. CREATE integrates genomic sequences, chromatin accessibility, and chromatin interaction data to generate discrete CRE embeddings, enabling accurate multi-class classification and robust characterization of CREs. CREATE excels in identifying cell-type-specific CREs, and provides quantitative and interpretable insights into CRE-specific features, uncovering the underlying regulatory codes. By facilitating large-scale prediction of CREs in specific cell types, CREATE enhances the recognition of disease- or phenotype-associated biological variabilities of CREs, thus advancing our understanding of gene regulatory landscapes and their roles in health and disease.; https://doi.org/10.1038/s41467-025-59780-5 |
| title | CREATE: cell-type-specific cis-regulatory element identification via discrete embedding |
| title_sort | create cell type specific cis regulatory element identification via discrete embedding |
| url | https://doi.org/10.1038/s41467-025-59780-5 |
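
The description above says CREATE is based on a Vector Quantized Variational AutoEncoder and generates discrete CRE embeddings. This record gives no implementation details, so the following is only a minimal sketch of the generic vector-quantization step that VQ-VAE models use: each continuous latent vector is replaced by the index of its nearest codebook entry. The function names, the toy codebook, and the toy latent values are all hypothetical and are not taken from CREATE.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous latent vector to its nearest codebook entry.

    Returns the chosen codebook indices (the discrete embedding) and the
    matching codebook vectors (the quantized latents).
    """
    if not codebook:
        raise ValueError("codebook must not be empty")
    indices: List[int] = []
    for z in latents:
        # Nearest-neighbour lookup; min() keeps the first index on ties.
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(nearest)
    quantized = [list(codebook[k]) for k in indices]
    return indices, quantized


# Hypothetical toy values, for illustration only.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(latents, codebook)
print(indices)    # [0, 1, 2]
print(quantized)  # [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
```

In a trained VQ-VAE the codebook is learned jointly with the encoder and decoder; the sketch covers only the lookup that turns continuous encoder outputs into discrete codes.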