An Effective Pipeline for Training Variational Autoencoders for Synthesizable and Optimized Molecular Design
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10817611/ |
| Summary: | Variational autoencoders (VAEs) for molecular design and optimization have gained popularity because they explore high-dimensional molecular space efficiently to identify novel molecules with various properties of interest. For example, in drug discovery, one may want to optimize small molecules for higher affinity against a specific target, improved bioavailability, and better solubility. However, many existing models of this kind struggle to produce molecules that not only possess the desired properties but can also be synthesized in practice. Synthetic route decoding models, by contrast, can decode a molecule’s synthetic route from known reactants and reaction templates and translate it into a synthesizable version of that molecule. However, these models may not be suitable for generating molecules with diverse optimized properties. In this paper, we aim to combine the strengths of the generative and synthetic route decoding models to overcome the shortcomings each has when used separately. Specifically, we propose a practical framework for enhancing the synthesizability of property-optimized molecules suggested by VAE models: the original baseline model is retrained on a constrained dataset curated using a synthetic route decoder. To demonstrate its efficacy, we apply the framework to a hierarchical molecular generation model built on a VAE and show that the enhanced model can effectively produce diverse molecules with significantly improved synthesizability. |
| ISSN: | 2169-3536 |
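
The curation step described in the summary can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the article: `curate_synthesizable`, `toy_route_decoder`, and the lookup table of SMILES strings are illustrative stand-ins, and the sketch assumes the constrained dataset stores the synthesizable version returned by the decoder, which the summary does not specify. A real pipeline would call a trained synthetic route decoder and then retrain the baseline VAE on the resulting set.

```python
from typing import Callable, Iterable, List, Optional


def curate_synthesizable(
    candidates: Iterable[str],
    decode_route: Callable[[str], Optional[str]],
) -> List[str]:
    """Build a constrained dataset from candidate molecules.

    For each candidate SMILES string, ask the route decoder for a
    synthesizable version. Candidates with no route (None) are dropped,
    and duplicates are removed while preserving first-seen order.
    """
    curated: List[str] = []
    seen = set()
    for smiles in candidates:
        analog = decode_route(smiles)  # None means no synthetic route was found
        if analog is not None and analog not in seen:
            seen.add(analog)
            curated.append(analog)
    return curated


def toy_route_decoder(smiles: str) -> Optional[str]:
    """Hypothetical stand-in for a synthetic route decoder.

    A real decoder would search known reactants and reaction templates;
    this one only looks up a tiny hand-written table.
    """
    known = {
        "CCO": "CCO",              # already synthesizable, returned unchanged
        "c1ccccc1O": "c1ccccc1O",  # already synthesizable, returned unchanged
        "CC(=O)N": "CC(=O)NC",     # mapped to a nearby synthesizable version
    }
    return known.get(smiles)


# Candidates as they might come from sampling a baseline generative model.
candidates = ["CCO", "C1CC1N=O", "c1ccccc1O", "CC(=O)N", "CCO"]
retraining_set = curate_synthesizable(candidates, toy_route_decoder)
print(retraining_set)

# The baseline model would then be retrained on `retraining_set`;
# that step depends on the model and is left out of this sketch.
```

Passing the decoder in as a callable keeps the filtering logic independent of any particular route decoding model, so the toy lookup can be swapped for a real decoder without changing the curation function.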