Design and structure of overlapping regions in PCA via deep learning
Polymerase cycling assembly (PCA) stands out as the predominant method in the synthesis of kilobase-length DNA fragments. The design of overlapping regions is the core factor affecting the success rate of synthesis. However, there still exist DNA sequences that are challenging to design and constru...
Main Authors: | Yan Zheng, Xi-Chen Cui, Fei Guo, Ming-Liang Dou, Ze-Xiong Xie, Ying-Jin Yuan |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co., Ltd., 2025-06-01 |
Series: | Synthetic and Systems Biotechnology |
Subjects: | Synthetic biology, PCA, Deep learning, Molecular dynamics |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405805X24001595 |
_version_ | 1832583869628416000 |
---|---|
author | Yan Zheng; Xi-Chen Cui; Fei Guo; Ming-Liang Dou; Ze-Xiong Xie; Ying-Jin Yuan |
author_facet | Yan Zheng; Xi-Chen Cui; Fei Guo; Ming-Liang Dou; Ze-Xiong Xie; Ying-Jin Yuan |
author_sort | Yan Zheng |
collection | DOAJ |
description | Polymerase cycling assembly (PCA) stands out as the predominant method in the synthesis of kilobase-length DNA fragments. The design of overlapping regions is the core factor affecting the success rate of synthesis. However, there still exist DNA sequences that are challenging to design and construct in genome synthesis. Here we proposed a deep learning model based on extensive synthesis data to discern latent sequence representations in overlapping regions with an AUPR of 0.805. Utilizing the model, we developed the SmartCut algorithm aimed at designing oligonucleotides and enhancing the success rate of PCA experiments. This algorithm was successfully applied to sequences with diverse synthesis constraints, 80.4% of which were synthesized in a single round. We further discovered structural differences, represented by major groove width, stagger, slide, and centroid distance, between overlapping and non-overlapping regions, which elucidated the model's reasonableness through the lens of physical chemistry. This comprehensive approach facilitates streamlined and efficient investigations into genome synthesis. |
format | Article |
id | doaj-art-dd310c70e8bd45079add7baab5274c43 |
institution | Kabale University |
issn | 2405-805X |
language | English |
publishDate | 2025-06-01 |
publisher | KeAi Communications Co., Ltd. |
record_format | Article |
series | Synthetic and Systems Biotechnology |
spelling | doaj-art-dd310c70e8bd45079add7baab5274c43; 2025-01-28T04:14:44Z; eng; KeAi Communications Co., Ltd.; Synthetic and Systems Biotechnology; 2405-805X; 2025-06-01; vol. 10, no. 2, pp. 442-451; Design and structure of overlapping regions in PCA via deep learning; Yan Zheng [0], Xi-Chen Cui [1], Fei Guo [2], Ming-Liang Dou [3], Ze-Xiong Xie [4], Ying-Jin Yuan [5]; [0] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China; [1] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China; [2] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China; [3] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; [4] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China; Corresponding author. Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; [5] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China; Corresponding author. Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, PR China; Abstract: Polymerase cycling assembly (PCA) stands out as the predominant method in the synthesis of kilobase-length DNA fragments. The design of overlapping regions is the core factor affecting the success rate of synthesis. However, there still exist DNA sequences that are challenging to design and construct in genome synthesis. Here we proposed a deep learning model based on extensive synthesis data to discern latent sequence representations in overlapping regions with an AUPR of 0.805. Utilizing the model, we developed the SmartCut algorithm aimed at designing oligonucleotides and enhancing the success rate of PCA experiments. This algorithm was successfully applied to sequences with diverse synthesis constraints, 80.4% of which were synthesized in a single round. We further discovered structural differences, represented by major groove width, stagger, slide, and centroid distance, between overlapping and non-overlapping regions, which elucidated the model's reasonableness through the lens of physical chemistry. This comprehensive approach facilitates streamlined and efficient investigations into genome synthesis.; http://www.sciencedirect.com/science/article/pii/S2405805X24001595; Synthetic biology; PCA; Deep learning; Molecular dynamics |
spellingShingle | Yan Zheng; Xi-Chen Cui; Fei Guo; Ming-Liang Dou; Ze-Xiong Xie; Ying-Jin Yuan; Design and structure of overlapping regions in PCA via deep learning; Synthetic and Systems Biotechnology; Synthetic biology; PCA; Deep learning; Molecular dynamics |
title | Design and structure of overlapping regions in PCA via deep learning |
title_full | Design and structure of overlapping regions in PCA via deep learning |
title_fullStr | Design and structure of overlapping regions in PCA via deep learning |
title_full_unstemmed | Design and structure of overlapping regions in PCA via deep learning |
title_short | Design and structure of overlapping regions in PCA via deep learning |
title_sort | design and structure of overlapping regions in pca via deep learning |
topic | Synthetic biology; PCA; Deep learning; Molecular dynamics |
url | http://www.sciencedirect.com/science/article/pii/S2405805X24001595 |
work_keys_str_mv | AT yanzheng designandstructureofoverlappingregionsinpcaviadeeplearning AT xichencui designandstructureofoverlappingregionsinpcaviadeeplearning AT feiguo designandstructureofoverlappingregionsinpcaviadeeplearning AT mingliangdou designandstructureofoverlappingregionsinpcaviadeeplearning AT zexiongxie designandstructureofoverlappingregionsinpcaviadeeplearning AT yingjinyuan designandstructureofoverlappingregionsinpcaviadeeplearning |