Single-step retrosynthesis prediction via multitask graph representation learning
Abstract Inferring appropriate synthesis reaction (i.e., retrosynthesis) routes for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited by the known synthesis templates; templat...
Main Authors: | Peng-Cheng Zhao, Xue-Xin Wei, Qiong Wang, Qi-Hao Wang, Jia-Ning Li, Jie Shang, Cheng Lu, Jian-Yu Shi |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-025-56062-y |
_version_ | 1832594608172826624 |
---|---|
author | Peng-Cheng Zhao; Xue-Xin Wei; Qiong Wang; Qi-Hao Wang; Jia-Ning Li; Jie Shang; Cheng Lu; Jian-Yu Shi |
author_sort | Peng-Cheng Zhao |
collection | DOAJ |
description | Abstract Inferring appropriate synthesis reaction (i.e., retrosynthesis) routes for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited by the known synthesis templates; template-free methods are weakly interpretable; and semi template-based methods are deficient with regard to utilizing the associations between chemical entities. To address these issues, this paper leverages the intra-associations between synthons, the inter-associations between synthons and leaving groups (LGs), and the intra-associations between LGs. It develops a multitask graph representation learning model for single-step retrosynthesis prediction (Retro-MTGR) to solve reaction centre deduction and LG identification simultaneously. A comparison with 16 state-of-the-art methods first demonstrates the superiority of Retro-MTGR. Then, its robustness and scalability and the contributions of its crucial components are validated. More importantly, it can determine whether a bond can be a reaction centre and what LGs are appropriate for a given synthon, respectively. The answers reflect underlying chemical synthesis rules, especially opposite electrical properties between chemical entities (e.g., reaction sites, synthons, and LGs). Finally, case studies demonstrate that the retrosynthesis routes inferred by Retro-MTGR are promising for single-step synthesis reactions. The code and data of this study are freely available at https://doi.org/10.5281/zenodo.14346324 . |
format | Article |
id | doaj-art-2ccca88d196241efb9889a4502da9e1f |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-2ccca88d196241efb9889a4502da9e1f; 2025-01-19T12:30:08Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-01-01; volume 16, issue 1, pages 1-19; 10.1038/s41467-025-56062-y; Single-step retrosynthesis prediction via multitask graph representation learning; Peng-Cheng Zhao (School of Life Sciences, Northwestern Polytechnical University); Xue-Xin Wei (School of Life Sciences, Northwestern Polytechnical University); Qiong Wang (School of Life Sciences, Northwestern Polytechnical University); Qi-Hao Wang (School of Chemistry and Chemical Engineering, Northwestern Polytechnical University); Jia-Ning Li (School of Life Sciences, Northwestern Polytechnical University); Jie Shang (School of Life Sciences, Northwestern Polytechnical University); Cheng Lu (Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences); Jian-Yu Shi (School of Life Sciences, Northwestern Polytechnical University); https://doi.org/10.1038/s41467-025-56062-y |
title | Single-step retrosynthesis prediction via multitask graph representation learning |
url | https://doi.org/10.1038/s41467-025-56062-y |