Benchmarking molecular conformer augmentation with context-enriched training: graph-based transformer versus GNN models
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-05-01 |
| Series: | Journal of Cheminformatics |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s13321-025-01004-5 |
| Summary: | The field of molecular representation has witnessed a shift towards models trained on molecular structures represented by strings or graphs, with chemical information encoded in nodes and bonds. Graph-based representations offer a more realistic depiction and support 3D geometry and conformer-based augmentation. Graph Neural Networks (GNNs) and Graph-based Transformer models (GTs) represent two paradigms in this field, with GT models emerging as a flexible alternative. In this study, we compare the performance of GT models against GNN models on three datasets. We explore the impact of training procedures, including context-enriched training through pretraining on quantum mechanical atomic-level properties and auxiliary task training. Our analysis focuses on Sterimol parameter estimation, binding energy estimation, and generalization performance for transition metal complexes. We find that GT models with context-enriched training deliver results on par with GNN models, with the added advantages of speed and flexibility. Our findings highlight the potential of GT models as a valid alternative for molecular representation learning tasks. |
| ISSN: | 1758-2946 |
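
The summary mentions conformer-based augmentation only at a high level. The snippet below is a minimal sketch, assuming RDKit, of generating several 3D conformers per molecule to serve as augmented inputs for a 3D-aware graph model; the function name, parameters, and workflow are illustrative assumptions and are not taken from the paper.

```python
# Minimal conformer-augmentation sketch (assumes RDKit is installed).
# The paper's actual augmentation pipeline and model interface are not
# reproduced here; this only illustrates the general idea.
from rdkit import Chem
from rdkit.Chem import AllChem


def conformer_augment(smiles: str, n_confs: int = 5, seed: int = 42):
    """Return a list of (atomic numbers, 3D coordinates) pairs, one per conformer."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    mol = Chem.AddHs(mol)

    # Embed several conformers with the ETKDG method, then relax them with MMFF.
    params = AllChem.ETKDGv3()
    params.randomSeed = seed
    conf_ids = AllChem.EmbedMultipleConfs(mol, numConfs=n_confs, params=params)
    AllChem.MMFFOptimizeMoleculeConfs(mol)

    atomic_numbers = [atom.GetAtomicNum() for atom in mol.GetAtoms()]
    samples = []
    for cid in conf_ids:
        coords = mol.GetConformer(cid).GetPositions()  # (n_atoms, 3) array
        samples.append((atomic_numbers, coords))
    return samples


# Each conformer yields the same molecular graph with a different 3D geometry,
# which a GNN or graph transformer can consume as an augmented training sample.
augmented = conformer_augment("CCO", n_confs=3)
print(len(augmented), "conformers generated for ethanol")
```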