Efficient and assured reinforcement learning-based building HVAC control with heterogeneous expert-guided training
Abstract Building heating, ventilation, and air conditioning (HVAC) systems account for nearly half of building energy consumption and $$20\%$$ of total energy consumption in the US. Their operation is also crucial for ensuring the physical and mental health of building occupants. Compared with trad...
Saved in:
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Nature Portfolio
2025-03-01
|
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-91326-z |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Abstract Building heating, ventilation, and air conditioning (HVAC) systems account for nearly half of building energy consumption and $$20\%$$ of total energy consumption in the US. Their operation is also crucial for ensuring the physical and mental health of building occupants. Compared with traditional model-based HVAC control methods, the recent model-free deep reinforcement learning (DRL) based methods have shown good performance while do not require the development of detailed and costly physical models. However, these model-free DRL approaches often suffer from long training time to reach a good performance, which is a major obstacle for their practical deployment. In this work, we present a systematic approach to accelerate online reinforcement learning for HVAC control by taking full advantage of the knowledge from domain experts in various forms. Specifically, the algorithm stages include learning expert functions from existing abstract physical models and from historical data via offline reinforcement learning, integrating the expert functions with rule-based guidelines, conducting training guided by the integrated expert function and performing policy initialization from distilled expert function. Moreover, to ensure that the learned DRL-based HVAC controller can effectively keep room temperature within the comfortable range for occupants, we design a runtime shielding framework to reduce the temperature violation rate and incorporate the learned controller into it. Experimental results demonstrate up to 8.8X speedup in DRL training from our approach over previous methods, with low temperature violation rate. |
|---|---|
| ISSN: | 2045-2322 |