Screening of multi deep learning-based de novo molecular generation models and their application for specific target molecular generation
Abstract Traditional virtual screening methods need to explore expanse and vast chemical spaces and need to be based on existing chemical libraries. With the development of deep learning techniques for the de novo generation of molecules, also known as inverse molecular design, the increasingly wide...
Saved in:
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Nature Portfolio
2025-02-01
|
Series: | Scientific Reports |
Subjects: | |
Online Access: | https://doi.org/10.1038/s41598-025-86840-z |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Abstract Traditional virtual screening methods need to explore expanse and vast chemical spaces and need to be based on existing chemical libraries. With the development of deep learning techniques for the de novo generation of molecules, also known as inverse molecular design, the increasingly widespread application of various types of deep learning algorithms has led to revolutionary changes in de novo molecular generation research. In particular, the emergence of a novel natural language processing (NLP) architecture called the transformer has improved the state-of-the-art performance of existing AI technologies. In this study, we modified one top-performing molecular generation model on the basis of the generative pretraining transformer (GPT) architecture in three directions. Moreover, we propose an integrated end-to-end neural network learning framework based on one complete encoder-decoder architecture transformer model: Transfer Text-to-Text Transformer (T5), by learning the embedding vector representation space of conditional molecular properties to encode and guide the vector representation of SMILES sequences, resulting in the output of the final decoder block with a softmax output (maximum likelihood objective). Moreover, we evaluated the performance of these NLP-based generation models and another new model architecture based on a selective state space and selected the best approach jointing a transfer learning strategy for de novo drug discovery to target L858R/T790M/C797S-mutant EGFR in non-small cell lung cancer. |
---|---|
ISSN: | 2045-2322 |