VAE-Assisted Data Augmentation for Improved Molecular Prediction with Graph Neural Networks (GNNs) in Low-Data Regimes
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | AIDIC Servizi S.r.l., 2025-07-01 |
| Series: | Chemical Engineering Transactions |
| Online Access: | https://www.cetjournal.it/index.php/cet/article/view/15421 |
| Summary: | This study presents a novel approach to enhancing molecular property prediction through variational autoencoder (VAE)-assisted data augmentation in low-data regimes. The methodology combines graph neural networks (GNNs) with VAEs to improve predictive accuracy on two molecular datasets from MoleculeNet: ESOL (water solubility) and FreeSolv (hydration free energy). By generating chemically valid molecules that align with the original dataset's chemical space, the approach enhances model performance, particularly for graph attention networks (GATs). When trained on the augmented datasets, GAT models show R² increasing from 0.879 to 0.918 on FreeSolv and from 0.873 to 0.885 on ESOL. The study validates the VAE-generated molecules through chemical space analysis and property distribution comparisons, offering a promising solution for molecular property prediction in data-limited scenarios. |
| ISSN: | 2283-9216 |