Enhanced deep deterministic policy gradient algorithm

To address the slow convergence of the deep deterministic policy gradient (DDPG) algorithm, an enhanced deep deterministic policy gradient (E-DDPG) algorithm was proposed. Building on DDPG, two sample pools were constructed and the temporal-difference (TD) error was introduced, so that priority samples were added during experience replay. During training, samples were drawn from the two pools respectively, and a bisimulation metric was introduced to ensure the diversity of the selected samples and improve the convergence rate of the algorithm. The E-DDPG algorithm was applied to the pendulum problem. Experimental results show that E-DDPG can effectively improve convergence performance on continuous action space problems and has better stability.
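The abstract outlines the core mechanism of E-DDPG: two experience pools, TD-error-based priority during replay, and a bisimulation metric that keeps the drawn samples diverse. The sketch below only illustrates that idea and is not the authors' implementation; the class name, pool capacity, the 50/50 split between pools, the diversity threshold, and the simplified reward-plus-successor-state distance used in place of a full bisimulation metric are all illustrative assumptions.

```python
# Minimal sketch (assumed, not from the paper) of a dual replay buffer with
# TD-error priority and a diversity check inspired by a bisimulation metric.
import heapq
import random
import numpy as np

class DualReplayBuffer:
    """Two sample pools: a uniform pool and a TD-error-ranked priority pool."""

    def __init__(self, capacity=10000, diversity_threshold=0.1):
        self.capacity = capacity
        self.threshold = diversity_threshold
        self.uniform_pool = []    # plain FIFO pool of transitions
        self.priority_pool = []   # heap of (-|td_error|, insertion_id, transition)
        self._counter = 0

    def add(self, state, action, reward, next_state, td_error):
        transition = (np.asarray(state, dtype=float),
                      np.asarray(action, dtype=float),
                      float(reward),
                      np.asarray(next_state, dtype=float))
        self.uniform_pool.append(transition)
        if len(self.uniform_pool) > self.capacity:
            self.uniform_pool.pop(0)
        heapq.heappush(self.priority_pool, (-abs(td_error), self._counter, transition))
        self._counter += 1
        if len(self.priority_pool) > self.capacity:
            # keep only the transitions with the largest |TD error|
            self.priority_pool = heapq.nsmallest(self.capacity, self.priority_pool)
            heapq.heapify(self.priority_pool)

    @staticmethod
    def _distance(t1, t2):
        # Crude stand-in for a bisimulation metric: difference in immediate reward
        # plus distance between successor states.  A true bisimulation metric would
        # also compare the transition distributions themselves.
        return abs(t1[2] - t2[2]) + float(np.linalg.norm(t1[3] - t2[3]))

    def sample(self, batch_size=32):
        # Half of the minibatch comes from the priority pool (largest |TD error|) ...
        half = batch_size // 2
        batch = [item[2] for item in heapq.nsmallest(half, self.priority_pool)]
        # ... and the rest from the uniform pool, skipping transitions that are too
        # close to what is already in the batch, to keep the minibatch diverse.
        n = min(len(self.uniform_pool), batch_size * 4)
        for cand in random.sample(self.uniform_pool, n):
            if len(batch) >= batch_size:
                break
            if all(self._distance(cand, b) > self.threshold for b in batch):
                batch.append(cand)
        return batch
```

In a DDPG-style training loop, one would call add() with the critic's TD error for each new transition and draw minibatches with sample(); the actor and critic updates themselves would remain those of standard DDPG.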


Bibliographic Details
Main Authors: Jianping CHEN, Chao HE, Quan LIU, Hongjie WU, Fuyuan HU, Qiming FU
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications, 2018-11-01
Series: Tongxin xuebao
ISSN: 1000-436X
Subjects: deep reinforcement learning; sample ranking; bisimulation metric; temporal difference error
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2018238/