Deep deterministic policy gradient algorithm based on dung beetle optimization and priority experience replay mechanism

Bibliographic Details
Main Authors:	Hengwei Zhu, Chuiting Rong, Haorui Liu
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-04-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-025-99213-3

author	Hengwei Zhu, Chuiting Rong, Haorui Liu
collection	DOAJ
description	Abstract: Reinforcement learning algorithms for continuous action spaces suffer from slow convergence and a tendency to get stuck in local optima. We therefore propose a deep deterministic policy gradient algorithm based on the dung beetle optimization algorithm and a priority experience replay mechanism (DBOP–DDPG). The method first introduces the dung beetle optimizer (DBO) to search with multiple populations simultaneously, which keeps the algorithm from falling into local optima and improves its global optimization capability. We then design a criterion for determining the priority of sample data and improve the sampling of the experience replay mechanism: sample data are stored, according to their importance, in three replay mechanisms for subsequent sampling and training, which improves the algorithm's convergence speed. Finally, the method was tested in three classic control environments of OpenAI Gym. The results show that it converges at least 10% faster than the comparison algorithm and increases the cumulative reward by up to 150.
format	Article
id	doaj-art-70fd3929faa94e5dabfb763dc1be0bd4
institution	OA Journals
issn	2045-2322
language	English
publishDate	2025-04-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-70fd3929faa94e5dabfb763dc1be0bd4 | 2025-08-20T02:20:23Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-04-01 | 15 1 1 14 | 10.1038/s41598-025-99213-3 | Deep deterministic policy gradient algorithm based on dung beetle optimization and priority experience replay mechanism | Hengwei Zhu (0), Chuiting Rong (1), Haorui Liu (2) | (0), (1), (2): College of Computer and Information Engineering, Dezhou University | https://doi.org/10.1038/s41598-025-99213-3
title	Deep deterministic policy gradient algorithm based on dung beetle optimization and priority experience replay mechanism
url	https://doi.org/10.1038/s41598-025-99213-3

Similar Items