Enhanced deep deterministic policy gradient algorithm

Bibliographic Details
Main Authors:	Jianping CHEN, Chao HE, Quan LIU, Hongjie WU, Fuyuan HU, Qiming FU
Format:	Article
Language:	zho
Published:	Editorial Department of Journal on Communications 2018-11-01
Series:	Tongxin xuebao
Subjects:	deep reinforcement learning; sample ranking; bisimulation metric; temporal difference error
Online Access:	http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2018238/

Description
Summary:	To address the slow convergence of the deep deterministic policy gradient (DDPG) algorithm, an enhanced deep deterministic policy gradient (E-DDPG) algorithm is proposed. Building on DDPG, two sample pools are constructed and the temporal difference (TD) error is introduced. Priority samples are added during experience replay, and training samples are drawn from each of the two pools. In addition, a bisimulation metric is introduced to ensure the diversity of the selected samples and to improve the convergence rate of the algorithm. The E-DDPG algorithm is applied to the pendulum problem. The experimental results show that E-DDPG effectively improves convergence performance on continuous action space problems and has better stability.
ISSN:	1000-436X
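
The two-pool replay idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DualPoolReplay`, the `td_threshold` cut-off, the `priority_fraction` split, and the rule "a transition joins the priority pool when its absolute TD error reaches the threshold" are all assumptions made for illustration, and the bisimulation-metric diversity check is left out entirely because the record gives no details about it.

```python
import random
from collections import deque


def td_error(reward, gamma, q_next, q_current):
    """One-step TD error: r + gamma * Q(s', a') - Q(s, a)."""
    return reward + gamma * q_next - q_current


class DualPoolReplay:
    """Illustrative two-pool replay buffer (hypothetical, not the paper's code)."""

    def __init__(self, capacity=10000, td_threshold=1.0,
                 priority_fraction=0.5, seed=None):
        self.ordinary = deque(maxlen=capacity)   # holds every transition
        self.priority = deque(maxlen=capacity)   # holds high-|TD error| transitions only
        self.td_threshold = td_threshold         # assumed cut-off for "priority"
        self.priority_fraction = priority_fraction  # assumed share of each batch
        self.rng = random.Random(seed)

    def add(self, transition, delta):
        """Store a transition; also copy it to the priority pool if |delta| is large."""
        self.ordinary.append(transition)
        if abs(delta) >= self.td_threshold:
            self.priority.append(transition)

    def sample(self, batch_size):
        """Draw part of the batch from each pool, without replacement within a pool."""
        n_priority = min(int(batch_size * self.priority_fraction), len(self.priority))
        n_ordinary = min(batch_size - n_priority, len(self.ordinary))
        batch = self.rng.sample(list(self.priority), n_priority)
        batch += self.rng.sample(list(self.ordinary), n_ordinary)
        return batch


# Usage: fill the buffer with dummy transitions, then draw a mixed batch.
buf = DualPoolReplay(capacity=100, td_threshold=0.5, priority_fraction=0.5, seed=0)
for i in range(20):
    delta = td_error(reward=float(i % 4), gamma=0.99, q_next=0.0, q_current=1.0)
    buf.add((f"s{i}", "a", float(i % 4), f"s{i + 1}"), delta)
batch = buf.sample(8)
```

Because a high-error transition lives in both pools in this sketch, it can appear twice in one batch; a fuller version would deduplicate or, following the summary, filter near-duplicate states with a similarity measure.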