Mobile robot path planning using deep deterministic policy gradient with differential gaming (DDPG-DG) exploration

Mobile robot path planning involves decision-making in uncertain, dynamic conditions, where Reinforcement Learning (RL) algorithms excel at generating safe and optimal paths. The Deep Deterministic Policy Gradient (DDPG) is an RL technique suited to mobile robot navigation. RL algorithms must balance exploitation and exploration to learn effectively; this balance directly affects learning efficiency. This research proposes a method that combines the DDPG strategy for exploitation with a Differential Gaming (DG) strategy for exploration. The DG algorithm ensures the mobile robot always reaches its target without collisions, thereby adding positive learning episodes to the memory buffer. An epsilon-greedy strategy determines whether to explore or exploit; when exploration is chosen, the DG algorithm is employed. Combining the DG strategy with DDPG speeds up learning by increasing the number of successful episodes and reducing the number of failure episodes in the experience buffer. Because DDPG supports continuous state and action spaces, it yields smoother, non-jerky movements and improved control over turns when navigating around obstacles. Reward shaping considers finer details, ensuring that even small advantages in each iteration contribute to learning. Across diverse test scenarios, DG exploration, compared to random exploration, results in an average increase of 389% in successful target reaches and a 39% decrease in collisions. Additionally, DG exploration shows a 69% improvement in the number of episodes where convergence is achieved within a maximum of 2000 steps.
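The core mechanism the abstract describes, an epsilon-greedy switch that exploits the DDPG actor or explores with a DG-guided action, with every transition stored in an experience buffer, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `ddpg_actor` and `dg_guidance` are hypothetical stand-ins (the DG stand-in simply steers straight toward the target, whereas the paper uses a full differential-gaming controller).

```python
import random
from collections import deque

def dg_guidance(state, target):
    """Placeholder DG exploration strategy: a unit-vector step toward the
    target. A stand-in for the paper's differential-gaming controller,
    which guarantees collision-free target reaching."""
    dx, dy = target[0] - state[0], target[1] - state[1]
    norm = max((dx * dx + dy * dy) ** 0.5, 1e-9)
    return (dx / norm, dy / norm)

def ddpg_actor(state):
    """Placeholder for the trained DDPG policy network (continuous action)."""
    return (0.0, 0.0)

def select_action(state, target, epsilon):
    """Epsilon-greedy switch: with probability epsilon explore via DG,
    otherwise exploit the DDPG actor."""
    if random.random() < epsilon:
        return dg_guidance(state, target)   # exploration episode
    return ddpg_actor(state)                # exploitation episode

# Experience buffer; DG-guided episodes seed it with successful transitions.
replay_buffer = deque(maxlen=100_000)

def store_transition(state, action, reward, next_state, done):
    replay_buffer.append((state, action, reward, next_state, done))
```

Because the DG stand-in always makes progress toward the target, exploration episodes tend to terminate successfully, which is how the method enriches the buffer with positive examples for the DDPG critic to learn from.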


Bibliographic Details
Main Authors: Shripad V. Deshpande, Harikrishnan R, Babul Salam KSM Kader Ibrahim, Mahesh Datta Sai Ponnuru
Format: Article
Language:English
Published: KeAi Communications Co. Ltd. 2024-01-01
Series:Cognitive Robotics
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S2667241324000119
author Shripad V. Deshpande
Harikrishnan R
Babul Salam KSM Kader Ibrahim
Mahesh Datta Sai Ponnuru
collection DOAJ
description Mobile robot path planning involves decision-making in uncertain, dynamic conditions, where Reinforcement Learning (RL) algorithms excel in generating safe and optimal paths. The Deep Deterministic Policy Gradient (DDPG) is an RL technique focused on mobile robot navigation. RL algorithms must balance exploitation and exploration to enable effective learning. The balance between these actions directly impacts learning efficiency. This research proposes a method combining the DDPG strategy for exploitation with the Differential Gaming (DG) strategy for exploration. The DG algorithm ensures the mobile robot always reaches its target without collisions, thereby adding positive learning episodes to the memory buffer. An epsilon-greedy strategy determines whether to explore or exploit. When exploration is chosen, the DG algorithm is employed. The combination of DG strategy with DDPG facilitates faster learning by increasing the number of successful episodes and reducing the number of failure episodes in the experience buffer. The DDPG algorithm supports continuous state and action spaces, resulting in smoother, non-jerky movements and improved control over the turns when navigating obstacles. Reward shaping considers finer details, ensuring even small advantages in each iteration contribute to learning. Through diverse test scenarios, it is demonstrated that DG exploration, compared to random exploration, results in an average increase of 389% in successful target reaches and a 39% decrease in collisions. Additionally, DG exploration shows a 69% improvement in the number of episodes where convergence is achieved within a maximum of 2000 steps.
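The description's claim that reward shaping lets "even small advantages in each iteration contribute to learning" can be illustrated with a dense, progress-based reward. This is a hypothetical sketch: the function name `shaped_reward` and the coefficients (`progress_scale`, `step_cost`, the terminal bonuses) are illustrative assumptions, not values from the paper.

```python
import math

def shaped_reward(prev_pos, pos, target, collided, reached,
                  progress_scale=10.0, step_cost=0.01):
    """Dense reward sketch: large terminal signals for collision/success,
    plus a small per-step term for reducing distance to the target, so
    every iteration produces a learning signal."""
    if collided:
        return -100.0            # failure episode: large penalty
    if reached:
        return 100.0             # successful target reach: large bonus
    d_prev = math.dist(prev_pos, target)
    d_now = math.dist(pos, target)
    # Small progress term minus a step cost that discourages wandering.
    return progress_scale * (d_prev - d_now) - step_cost
```

With only sparse terminal rewards, most transitions carry zero gradient signal; the dense progress term is one common way to make intermediate steps informative, consistent with the behavior the description attributes to the method.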
format Article
id doaj-art-ce9c8ea47f1447d2af5ea4676a3a3cce
institution OA Journals
issn 2667-2413
language English
publishDate 2024-01-01
publisher KeAi Communications Co. Ltd.
record_format Article
series Cognitive Robotics
spelling Cognitive Robotics, vol. 4 (2024-01-01), pp. 156-173. DOI: 10.1016/j.cogr.2024.08.002
author affiliations:
Shripad V. Deshpande: Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India, 412115
Harikrishnan R: Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India, 412115 (corresponding author)
Babul Salam KSM Kader Ibrahim: GUST Engineering & Applied Innovation Research Centre (GEAR), Gulf University for Science & Technology, Hawally, Kuwait
Mahesh Datta Sai Ponnuru: Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Kattankulathur, 603203, India
title Mobile robot path planning using deep deterministic policy gradient with differential gaming (DDPG-DG) exploration
topic Reinforcement learning
Mobile robot system
Differential gaming
Deep deterministic policy gradient
Epsilon greedy
Actor-critic
url http://www.sciencedirect.com/science/article/pii/S2667241324000119