A versatile CRISPR/Cas9 system off-target prediction tool using language model

Bibliographic Details
Main Authors: Weian Du, Liang Zhao, Kaichuan Diao, Yangyang Zheng, Qianyong Yang, Zhenzhen Zhu, Xiangxing Zhu, Dongsheng Tang
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Communications Biology
Online Access: https://doi.org/10.1038/s42003-025-08275-6
Description
Summary: Genome editing with the CRISPR/Cas9 system has revolutionized life and medical sciences, particularly in treating monogenic genetic diseases by enabling long-term therapeutic effects from a single intervention. However, the CRISPR/Cas9 system can tolerate mismatches and DNA/RNA bulges at target sites, leading to unintended off-target effects that pose challenges for gene-editing therapy development. Existing high-throughput detection and in silico prediction methods are often limited to specifically designed single guide RNAs (sgRNAs) and perform poorly on unseen sequences. To address these limitations, we introduce CCLMoff, a deep learning framework for off-target prediction that incorporates a pretrained RNA language model from RNAcentral. CCLMoff captures mutual sequence information between sgRNAs and target sites and is trained on a comprehensive, updated dataset. This approach enables accurate off-target identification and strong generalization across diverse NGS-based detection datasets. Model interpretation reveals the biological importance of the seed region, underscoring CCLMoff’s analytical capabilities. The development of CCLMoff lays the foundation for a comprehensive, end-to-end sgRNA design platform, enhancing both the precision and efficiency of CRISPR/Cas9-based therapeutics. CCLMoff is a versatile tool and is publicly available at github.com/duwa2/CCLMoff.
ISSN: 2399-3642