Single-step retrosynthesis prediction via multitask graph representation learning

Abstract Inferring appropriate synthesis reaction (i.e., retrosynthesis) routes for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited by the known synthesis templates; templat...

Bibliographic Details
Main Authors: Peng-Cheng Zhao, Xue-Xin Wei, Qiong Wang, Qi-Hao Wang, Jia-Ning Li, Jie Shang, Cheng Lu, Jian-Yu Shi
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-56062-y
author Peng-Cheng Zhao
Xue-Xin Wei
Qiong Wang
Qi-Hao Wang
Jia-Ning Li
Jie Shang
Cheng Lu
Jian-Yu Shi
collection DOAJ
description Abstract Inferring appropriate synthesis reaction (i.e., retrosynthesis) routes for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited by the known synthesis templates; template-free methods are weakly interpretable; and semi-template-based methods make poor use of the associations between chemical entities. To address these issues, this paper leverages the intra-associations between synthons, the inter-associations between synthons and leaving groups (LGs), and the intra-associations between LGs. It develops a multitask graph representation learning model for single-step retrosynthesis prediction (Retro-MTGR) to solve reaction centre deduction and LG identification simultaneously. A comparison with 16 state-of-the-art methods first demonstrates the superiority of Retro-MTGR. Then, its robustness and scalability and the contributions of its crucial components are validated. More importantly, it can determine whether a bond can be a reaction centre and which LGs are appropriate for a given synthon. The answers reflect underlying chemical synthesis rules, especially opposite electrical properties between chemical entities (e.g., reaction sites, synthons, and LGs). Finally, case studies demonstrate that the retrosynthesis routes inferred by Retro-MTGR are promising for single-step synthesis reactions. The code and data of this study are freely available at https://doi.org/10.5281/zenodo.14346324.
format Article
id doaj-art-2ccca88d196241efb9889a4502da9e1f
institution Kabale University
issn 2041-1723
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj-art-2ccca88d196241efb9889a4502da9e1f
2025-01-19T12:30:08Z
eng
Nature Portfolio
Nature Communications
2041-1723
2025-01-01
16(1):1-19
10.1038/s41467-025-56062-y
Single-step retrosynthesis prediction via multitask graph representation learning
Peng-Cheng Zhao (0), School of Life Sciences, Northwestern Polytechnical University
Xue-Xin Wei (1), School of Life Sciences, Northwestern Polytechnical University
Qiong Wang (2), School of Life Sciences, Northwestern Polytechnical University
Qi-Hao Wang (3), School of Chemistry and Chemical Engineering, Northwestern Polytechnical University
Jia-Ning Li (4), School of Life Sciences, Northwestern Polytechnical University
Jie Shang (5), School of Life Sciences, Northwestern Polytechnical University
Cheng Lu (6), Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences
Jian-Yu Shi (7), School of Life Sciences, Northwestern Polytechnical University
https://doi.org/10.1038/s41467-025-56062-y
title Single-step retrosynthesis prediction via multitask graph representation learning
url https://doi.org/10.1038/s41467-025-56062-y