A versatile CRISPR/Cas9 system off-target prediction tool using language model
Abstract Genome editing with the CRISPR/Cas9 system has revolutionized life and medical sciences, particularly in treating monogenic genetic diseases by enabling long-term therapeutic effects from a single intervention. However, the CRISPR/Cas9 system can tolerate mismatches and DNA/RNA bulges at ta...
| Main Authors: | Weian Du, Liang Zhao, Kaichuan Diao, Yangyang Zheng, Qianyong Yang, Zhenzhen Zhu, Xiangxing Zhu, Dongsheng Tang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-06-01 |
| Series: | Communications Biology |
| Online Access: | https://doi.org/10.1038/s42003-025-08275-6 |
| _version_ | 1849434147975593984 |
|---|---|
| author | Weian Du, Liang Zhao, Kaichuan Diao, Yangyang Zheng, Qianyong Yang, Zhenzhen Zhu, Xiangxing Zhu, Dongsheng Tang |
| author_sort | Weian Du |
| collection | DOAJ |
| description | Genome editing with the CRISPR/Cas9 system has revolutionized life and medical sciences, particularly in treating monogenic genetic diseases by enabling long-term therapeutic effects from a single intervention. However, the CRISPR/Cas9 system can tolerate mismatches and DNA/RNA bulges at target sites, leading to unintended off-target effects that pose challenges for gene-editing therapy development. Existing high-throughput detection and in silico prediction methods are often limited to specifically designed single guide RNAs (sgRNAs) and perform poorly on unseen sequences. To address these limitations, we introduce CCLMoff, a deep learning framework for off-target prediction that incorporates a pretrained RNA language model from RNAcentral. CCLMoff captures mutual sequence information between sgRNAs and target sites and is trained on a comprehensive, updated dataset. This approach enables accurate off-target identification and strong generalization across diverse NGS-based detection datasets. Model interpretation reveals the biological importance of the seed region, underscoring CCLMoff’s analytical capabilities. The development of CCLMoff lays the foundation for a comprehensive, end-to-end sgRNA design platform, enhancing both the precision and efficiency of CRISPR/Cas9-based therapeutics. CCLMoff is a versatile tool and is publicly available at github.com/duwa2/CCLMoff. |
| format | Article |
| id | doaj-art-94394bd412844c9ea4035b7bb00af13f |
| institution | Kabale University |
| issn | 2399-3642 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Communications Biology |
| spelling | doaj-art-94394bd412844c9ea4035b7bb00af13f; 2025-08-20T03:26:47Z; eng; Nature Portfolio; Communications Biology; 2399-3642; 2025-06-01; 81110; 10.1038/s42003-025-08275-6; A versatile CRISPR/Cas9 system off-target prediction tool using language model; Authors: Weian Du [0], Liang Zhao [1], Kaichuan Diao [2], Yangyang Zheng [3], Qianyong Yang [4], Zhenzhen Zhu [5], Xiangxing Zhu [6], Dongsheng Tang [7]; Affiliations: [0] Gene Editing Technology Center of Guangdong Province, School of Medicine, Foshan University; [1] Shenzhen Health Development Research and Data Management Center; [2] Shenzhen Center for Chronic Disease Control; [3] Guangdong Homy Genetics Ltd; [4] Jiujiang Key Laboratory of Rare Disease Research, Jiujiang University; [5] Shenzhen Health Development Research and Data Management Center; [6] Gene Editing Technology Center of Guangdong Province, School of Medicine, Foshan University; [7] Gene Editing Technology Center of Guangdong Province, School of Medicine, Foshan University; https://doi.org/10.1038/s42003-025-08275-6 |
| title | A versatile CRISPR/Cas9 system off-target prediction tool using language model |
| title_sort | versatile crispr cas9 system off target prediction tool using language model |
| url | https://doi.org/10.1038/s42003-025-08275-6 |