Charging Station Management Strategy for Returns Maximization via Improved TD3 Deep Reinforcement Learning

Maximizing the return on electric vehicle charging station (EVCS) operation helps expand EVCS deployment, which in turn grows the electric vehicle (EV) stock and better addresses climate change. However, in dynamic regulation scenarios with large data volumes, many variables, and short time scales, th...

Bibliographic Details
Main Authors: Hengjie Li, Jianghao Zhu, Yun Zhou, Qi Feng, Donghan Feng
Format: Article
Language: English
Published: Wiley 2022-01-01
Series: International Transactions on Electrical Energy Systems
Online Access: http://dx.doi.org/10.1155/2022/6854620
collection DOAJ
description Maximizing the return on electric vehicle charging station (EVCS) operation helps expand EVCS deployment, which in turn grows the electric vehicle (EV) stock and better addresses climate change. However, in dynamic regulation scenarios with large data volumes, many variables, and short time scales, existing regulation strategies that aim to maximize EVCS returns often fail to meet the demand. To handle increasingly complex regulation scenarios, this paper uses a deep reinforcement learning (DRL) algorithm based on an improved twin delayed deep deterministic policy gradient (TD3) to construct basic energy management strategies. To make the strategy better suited to real-time energy regulation, Thompson sampling is used to improve TD3’s exploration noise sampling, which greatly accelerates the initial convergence of TD3 during training. In addition, marginalised importance sampling is used to calculate the Q-return function for TD3, which makes the constructed strategies more likely to learn from high-value experiences while being more robust. Numerical experiments show that the charging station management strategy (CSMS) based on the modified TD3 converges fastest, is the most robust, and achieves the largest operational returns compared with CSMSs constructed using deep deterministic policy gradient (DDPG), actor-critic using Kronecker-factored trust region (ACKTR), trust region policy optimization (TRPO), proximal policy optimization (PPO), soft actor-critic (SAC), and the original TD3.
id doaj-art-ec7d650448af47daa7ab83446a74bcd2
institution OA Journals
issn 2050-7038
affiliation School of Electrical Engineering and Information Engineering (all five authors)