Surrogate Modeling for Building Design: Energy and Cost Prediction Compared to Simulation-Based Methods

Designing energy-efficient buildings is essential for reducing global energy consumption and carbon emissions. However, traditional physics-based simulation models require substantial computational resources, detailed input data, and domain expertise. To address these limitations, this study investi...

Bibliographic Details
Main Authors: Navid Shirzadi, Dominic Lau, Meli Stylianou
Format: Article
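Language: English
Published: MDPI AG 2025-07-01
Series: Buildings
Subjects: surrogate modeling; machine learning; building energy modeling; energy use intensity; cost estimation; early-stage building design
Online Access: https://www.mdpi.com/2075-5309/15/13/2361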
collection DOAJ
description Designing energy-efficient buildings is essential for reducing global energy consumption and carbon emissions. However, traditional physics-based simulation models require substantial computational resources, detailed input data, and domain expertise. To address these limitations, this study investigates three machine learning-based surrogate models, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), trained on a synthetic dataset of 2000 EnergyPlus-simulated building design scenarios to predict both energy use intensity (EUI) and cost estimates for midrise apartment buildings in the Toronto area. All three models exhibit strong predictive performance, with R² values exceeding 0.9 for both EUI and cost. XGBoost achieves the best cost prediction on the testing dataset, with a root mean squared error (RMSE) of 5.13 CAD/m², while MLP outperforms the others in EUI prediction, with a testing RMSE of 0.002 GJ/m². In terms of computational efficiency, the surrogate models significantly outperform a physics-based simulation model: MLP runs approximately 340 times faster, and XGBoost and RF achieve speedups of over 200 times. This study also examines the effect of training dataset size on model performance, identifying a point of diminishing returns beyond which further increases in data size yield minimal accuracy gains but substantially higher training times. To enhance model interpretability, SHapley Additive exPlanations (SHAP) analysis is used to quantify feature importance, revealing how different model types prioritize design parameters. A parametric design configuration analysis further evaluates the models' sensitivity to changes in building envelope features. Overall, the findings demonstrate that machine learning-based surrogate models can serve as fast, accurate, and interpretable alternatives to traditional simulation methods, supporting efficient decision-making during early-stage building design.
id doaj-art-968fa04ae06440069bef2e3624a8709b
institution DOAJ
issn 2075-5309
spelling doaj-art-968fa04ae06440069bef2e3624a8709b
2025-08-20T03:17:52Z
eng
MDPI AG
Buildings
ISSN 2075-5309
2025-07-01
Volume 15, Issue 13, Article 2361
DOI 10.3390/buildings15132361
Author affiliation (all three authors): CanmetENERGY-Ottawa, Natural Resources Canada, Ottawa, ON K1A 1M1, Canada
topic surrogate modeling
machine learning
building energy modeling
energy use intensity
cost estimation
early-stage building design