Mobile robot path planning using deep deterministic policy gradient with differential gaming (DDPG-DG) exploration

Mobile robot path planning involves decision-making in uncertain, dynamic conditions, where Reinforcement Learning (RL) algorithms excel at generating safe and optimal paths. The Deep Deterministic Policy Gradient (DDPG) is an RL technique suited to mobile robot navigation. RL algorithms must balance exploitation and exploration to learn effectively; this balance directly affects learning efficiency. This research proposes a method that combines the DDPG strategy for exploitation with a Differential Gaming (DG) strategy for exploration. The DG algorithm ensures the mobile robot always reaches its target without collisions, thereby adding positive learning episodes to the memory buffer. An epsilon-greedy strategy determines whether to explore or exploit; when exploration is chosen, the DG algorithm is employed. Combining the DG strategy with DDPG speeds up learning by increasing the number of successful episodes and reducing the number of failure episodes in the experience buffer. Because DDPG supports continuous state and action spaces, it yields smoother, non-jerky movements and improved control over turns when navigating around obstacles. Reward shaping considers finer details, ensuring that even small advantages in each iteration contribute to learning. Across diverse test scenarios, DG exploration, compared to random exploration, results in an average increase of 389% in successful target reaches and a 39% decrease in collisions. Additionally, DG exploration shows a 69% improvement in the number of episodes where convergence is achieved within a maximum of 2000 steps.
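The core mechanism the abstract describes, an epsilon-greedy switch that exploits the DDPG actor or explores with a DG-guided action, with every transition stored in an experience buffer, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `ddpg_actor` and `dg_guidance` are hypothetical stand-ins (the DG stand-in simply steers straight toward the target, whereas the paper uses a full differential-gaming controller).

```python
import random
from collections import deque

def dg_guidance(state, target):
    """Placeholder DG exploration strategy: a unit-vector step toward the
    target. A stand-in for the paper's differential-gaming controller,
    which guarantees collision-free target reaching."""
    dx, dy = target[0] - state[0], target[1] - state[1]
    norm = max((dx * dx + dy * dy) ** 0.5, 1e-9)
    return (dx / norm, dy / norm)

def ddpg_actor(state):
    """Placeholder for the trained DDPG policy network (continuous action)."""
    return (0.0, 0.0)

def select_action(state, target, epsilon):
    """Epsilon-greedy switch: with probability epsilon explore via DG,
    otherwise exploit the DDPG actor."""
    if random.random() < epsilon:
        return dg_guidance(state, target)   # exploration episode
    return ddpg_actor(state)                # exploitation episode

# Experience buffer; DG-guided episodes seed it with successful transitions.
replay_buffer = deque(maxlen=100_000)

def store_transition(state, action, reward, next_state, done):
    replay_buffer.append((state, action, reward, next_state, done))
```

Because the DG stand-in always makes progress toward the target, exploration episodes tend to terminate successfully, which is how the method enriches the buffer with positive examples for the DDPG critic to learn from.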


Bibliographic Details
Main Authors: Shripad V. Deshpande, Harikrishnan R, Babul Salam KSM Kader Ibrahim, Mahesh Datta Sai Ponnuru
Format: Article
Language:English
Published: KeAi Communications Co. Ltd. 2024-01-01
Series:Cognitive Robotics
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S2667241324000119
author Shripad V. Deshpande
Harikrishnan R
Babul Salam KSM Kader Ibrahim
Mahesh Datta Sai Ponnuru
collection DOAJ
description Mobile robot path planning involves decision-making in uncertain, dynamic conditions, where Reinforcement Learning (RL) algorithms excel in generating safe and optimal paths. The Deep Deterministic Policy Gradient (DDPG) is an RL technique focused on mobile robot navigation. RL algorithms must balance exploitation and exploration to enable effective learning. The balance between these actions directly impacts learning efficiency. This research proposes a method combining the DDPG strategy for exploitation with the Differential Gaming (DG) strategy for exploration. The DG algorithm ensures the mobile robot always reaches its target without collisions, thereby adding positive learning episodes to the memory buffer. An epsilon-greedy strategy determines whether to explore or exploit. When exploration is chosen, the DG algorithm is employed. The combination of DG strategy with DDPG facilitates faster learning by increasing the number of successful episodes and reducing the number of failure episodes in the experience buffer. The DDPG algorithm supports continuous state and action spaces, resulting in smoother, non-jerky movements and improved control over the turns when navigating obstacles. Reward shaping considers finer details, ensuring even small advantages in each iteration contribute to learning. Through diverse test scenarios, it is demonstrated that DG exploration, compared to random exploration, results in an average increase of 389% in successful target reaches and a 39% decrease in collisions. Additionally, DG exploration shows a 69% improvement in the number of episodes where convergence is achieved within a maximum of 2000 steps.
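The description's claim that reward shaping lets "even small advantages in each iteration contribute to learning" can be illustrated with a dense, progress-based reward. This is a hypothetical sketch: the function name `shaped_reward` and the coefficients (`progress_scale`, `step_cost`, the terminal bonuses) are illustrative assumptions, not values from the paper.

```python
import math

def shaped_reward(prev_pos, pos, target, collided, reached,
                  progress_scale=10.0, step_cost=0.01):
    """Dense reward sketch: large terminal signals for collision/success,
    plus a small per-step term for reducing distance to the target, so
    every iteration produces a learning signal."""
    if collided:
        return -100.0            # failure episode: large penalty
    if reached:
        return 100.0             # successful target reach: large bonus
    d_prev = math.dist(prev_pos, target)
    d_now = math.dist(pos, target)
    # Small progress term minus a step cost that discourages wandering.
    return progress_scale * (d_prev - d_now) - step_cost
```

With only sparse terminal rewards, most transitions carry zero gradient signal; the dense progress term is one common way to make intermediate steps informative, consistent with the behavior the description attributes to the method.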
format Article
id doaj-art-ce9c8ea47f1447d2af5ea4676a3a3cce
institution OA Journals
issn 2667-2413
language English
publishDate 2024-01-01
publisher KeAi Communications Co. Ltd.
record_format Article
series Cognitive Robotics
spelling Cognitive Robotics, vol. 4 (2024-01-01), pp. 156-173. DOI: 10.1016/j.cogr.2024.08.002
author affiliations:
Shripad V. Deshpande: Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India, 412115
Harikrishnan R: Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India, 412115 (corresponding author)
Babul Salam KSM Kader Ibrahim: GUST Engineering & Applied Innovation Research Centre (GEAR), Gulf University for Science & Technology, Hawally, Kuwait
Mahesh Datta Sai Ponnuru: Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Kattankulathur, 603203, India
title Mobile robot path planning using deep deterministic policy gradient with differential gaming (DDPG-DG) exploration
topic Reinforcement learning
Mobile robot system
Differential gaming
Deep deterministic policy gradient
Epsilon greedy
Actor-critic
url http://www.sciencedirect.com/science/article/pii/S2667241324000119