Materials-discovery workflow guided by symbolic regression for identifying acid-stable oxides for electrocatalysis

Bibliographic Details
Main Authors: Akhil S. Nair, Lucas Foppa, Matthias Scheffler
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: npj Computational Materials
Online Access: https://doi.org/10.1038/s41524-025-01596-4
Description
Summary: The efficiency of active learning (AL) approaches for identifying materials with desired properties relies on knowing the few parameters that describe the property. However, these parameters are often unknown when the property is governed by an intricate interplay of many atomistic processes. Here, we develop an AL workflow based on the sure-independence screening and sparsifying operator (SISSO) symbolic-regression approach. SISSO identifies analytical expressions correlated with a given materials property. These expressions depend on a few key physical parameters selected from the many primary features offered. Crucially, we train ensembles of SISSO models to quantify mean predictions and their uncertainty, which enables the use of SISSO in AL. We combine bootstrap sampling with Monte-Carlo dropout of primary features to obtain different datasets, which are used to train the SISSO models of the ensemble. The ensemble strategy improves model performance, and the feature-dropout procedure alleviates the overconfidence observed for the widely used bagging ensemble approach. We demonstrate the SISSO-guided AL workflow by identifying acid-stable oxides for water splitting using high-quality DFT-HSE06 calculations. From a pool of 1470 materials, 12 acid-stable materials are identified in only 30 AL iterations. The materials-property maps provided by SISSO, together with the uncertainty estimates, reduce the risk of missing promising portions of the materials space that were overlooked in the initial, possibly biased dataset.
ISSN: 2057-3960
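
The ensemble construction described in the summary (bootstrap sampling combined with random dropout of primary features, followed by a mean prediction and an uncertainty estimate) can be illustrated with a minimal sketch. This is not the authors' implementation: an ordinary least-squares fit stands in for SISSO, and the function names `fit_ensemble` and `predict_ensemble`, the dropout probability, and the ensemble size are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_ensemble(X, y, n_models=20, dropout=0.2, seed=0):
    """Fit stand-in models on bootstrap samples with random feature dropout.

    Assumption: a linear least-squares fit replaces the SISSO model.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    models = []
    for _ in range(n_models):
        # Bootstrap: draw training rows with replacement.
        rows = rng.integers(0, n_samples, size=n_samples)
        # Feature dropout: hide each primary feature with probability `dropout`.
        keep = rng.random(n_features) >= dropout
        if not keep.any():
            # Keep at least one feature so the fit is defined.
            keep[rng.integers(n_features)] = True
        # Design matrix of the kept features plus an intercept column.
        A = np.column_stack([X[rows][:, keep], np.ones(n_samples)])
        coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
        models.append((keep, coef))
    return models

def predict_ensemble(models, X):
    """Return the ensemble mean and standard deviation for each row of X."""
    preds = np.array([
        np.column_stack([X[:, keep], np.ones(len(X))]) @ coef
        for keep, coef in models
    ])
    return preds.mean(axis=0), preds.std(axis=0)

if __name__ == "__main__":
    # Synthetic data for illustration only; not taken from the paper.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(40, 5))
    y = 2.0 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=40)
    models = fit_ensemble(X, y)
    mean, std = predict_ensemble(models, X[:5])
    print(mean, std)
```

In an AL loop, the mean and standard deviation returned by `predict_ensemble` would feed an acquisition criterion that selects the next candidate to evaluate. The criterion used in the paper is not stated in this record, so none is sketched here.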