QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules
Abstract Progress in both Machine Learning (ML) and Quantum Chemistry (QC) methods has resulted in high-accuracy ML models for QC properties. Datasets such as MD17 and WS22 have been used to benchmark these models at a given level of QC method, or fidelity, which refers to the accuracy of the chose...
Main Authors: | Vivin Vinod, Peter Zaspel |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-02-01 |
Series: | Scientific Data |
Online Access: | https://doi.org/10.1038/s41597-024-04247-3 |
_version_ | 1823863325030612992 |
---|---|
author | Vivin Vinod, Peter Zaspel |
author_facet | Vivin Vinod, Peter Zaspel |
author_sort | Vivin Vinod |
collection | DOAJ |
description | Abstract Progress in both Machine Learning (ML) and Quantum Chemistry (QC) methods has resulted in high-accuracy ML models for QC properties. Datasets such as MD17 and WS22 have been used to benchmark these models at a given level of QC method, or fidelity, which refers to the accuracy of the chosen QC method. Multifidelity ML (MFML) methods, where models are trained on data from more than one fidelity, have been shown to be more effective than single-fidelity methods. Research in this direction is progressing for diverse applications ranging from energy band gaps to excitation energies. One hurdle for effective research here is the lack of a diverse multifidelity dataset for benchmarking. We provide the Quantum chemistry MultiFidelity (QeMFi) dataset consisting of five fidelities calculated with the TD-DFT formalism. The fidelities differ in their basis set choice: STO-3G, 3-21G, 6-31G, def2-SVP, and def2-TZVP. QeMFi offers the community a variety of QC properties such as vertical excitation properties and molecular dipole moments. Furthermore, QeMFi provides QC computation times, allowing for a time-benefit benchmark of multifidelity models for ML-QC. |
format | Article |
id | doaj-art-fde75014a2e14393b17bbda00780b4b2 |
institution | Kabale University |
issn | 2052-4463 |
language | English |
publishDate | 2025-02-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Data |
spelling | doaj-art-fde75014a2e14393b17bbda00780b4b2 2025-02-09T12:11:47Z eng Nature Portfolio Scientific Data 2052-4463 2025-02-01 121113 10.1038/s41597-024-04247-3 QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules Vivin Vinod 0 Peter Zaspel 1 School of Mathematics and Natural Sciences, University of Wuppertal School of Mathematics and Natural Sciences, University of Wuppertal Abstract Progress in both Machine Learning (ML) and Quantum Chemistry (QC) methods has resulted in high-accuracy ML models for QC properties. Datasets such as MD17 and WS22 have been used to benchmark these models at a given level of QC method, or fidelity, which refers to the accuracy of the chosen QC method. Multifidelity ML (MFML) methods, where models are trained on data from more than one fidelity, have been shown to be more effective than single-fidelity methods. Research in this direction is progressing for diverse applications ranging from energy band gaps to excitation energies. One hurdle for effective research here is the lack of a diverse multifidelity dataset for benchmarking. We provide the Quantum chemistry MultiFidelity (QeMFi) dataset consisting of five fidelities calculated with the TD-DFT formalism. The fidelities differ in their basis set choice: STO-3G, 3-21G, 6-31G, def2-SVP, and def2-TZVP. QeMFi offers the community a variety of QC properties such as vertical excitation properties and molecular dipole moments. Furthermore, QeMFi provides QC computation times, allowing for a time-benefit benchmark of multifidelity models for ML-QC. https://doi.org/10.1038/s41597-024-04247-3 |
spellingShingle | Vivin Vinod Peter Zaspel QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules Scientific Data |
title | QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules |
title_full | QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules |
title_fullStr | QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules |
title_full_unstemmed | QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules |
title_short | QeMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules |
title_sort | qemfi a multifidelity dataset of quantum chemical properties of diverse molecules |
url | https://doi.org/10.1038/s41597-024-04247-3 |
work_keys_str_mv | AT vivinvinod qemfiamultifidelitydatasetofquantumchemicalpropertiesofdiversemolecules AT peterzaspel qemfiamultifidelitydatasetofquantumchemicalpropertiesofdiversemolecules |