Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot


Bibliographic Details
Main Author: Mustafa Can Bingol
Format: Article
Language: English
Published: Sakarya University 2022-02-01
Series: Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
Subjects:
Online Access: https://dergipark.org.tr/tr/download/article-file/1693285
_version_ 1850252338822905856
author Mustafa Can Bingol
author_facet Mustafa Can Bingol
author_sort Mustafa Can Bingol
collection DOAJ
description Path planning is an essential topic in robotics research. Researchers have proposed methods such as particle swarm optimization, A*, and reinforcement learning (RL) to obtain a path. The current study aimed at RL-based safe path planning for a 3R planar robot. For this purpose, the environment was first built. Next, the state, action, reward, and termination functions were defined. Finally, the actor and critic artificial neural networks (ANNs), the basic components of deep deterministic policy gradient (DDPG), were formed to generate a safe path. A further aim of the study was to obtain an optimum actor ANN. To this end, ANN structures with 2, 4, and 8 layers and 512, 1024, 2048, and 4096 units were formed. These structures were trained for 5000 episodes of 200 steps each, and the best results were obtained with the 4-layer structures using 1024 and 2048 units. For this reason, 4 further ANN structures were built from 4-layer blocks with 1024 and 2048 units. The proposed structures were trained, and the NET-M2U-4L structure gave the best result among the four. NET-M2U-4L was then tested on 1000 different scenarios; the rate of generating a safe path was calculated as 93.80%, and the rate of colliding with an obstacle as 1.70%. Consequently, a safe path was planned and an optimum actor ANN was obtained for the 3R planar robot.
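The record contains no code, but the abstract outlines a DDPG actor-critic setup. As a loose sketch of that pairing (the state encoding, dimensions, and layer widths below are assumptions for illustration, not the paper's actual NET-M2U-4L configuration, whose hidden layers used 1024 and 2048 units):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """He-initialized weights and biases for a fully connected network."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, bounded=False):
    """ReLU hidden layers; optional tanh bounds the output to [-1, 1]."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return np.tanh(x) if bounded else x

# Assumed dimensions: 3 joint angles plus goal/obstacle features in the state,
# and one bounded command per joint of the 3R planar robot in the action.
STATE_DIM, ACTION_DIM = 9, 3

# Actor: state -> bounded joint-space action (4 weight layers, as in the
# best-performing 4-layer variants; widths shrunk to keep the sketch small).
actor = mlp([STATE_DIM, 64, 64, 64, ACTION_DIM])
# Critic: (state, action) -> scalar Q-value estimate.
critic = mlp([STATE_DIM + ACTION_DIM, 64, 64, 1])

state = rng.standard_normal(STATE_DIM)
action = forward(actor, state, bounded=True)                # shape (3,)
q_value = forward(critic, np.concatenate([state, action]))  # shape (1,)
```

In DDPG the actor outputs a deterministic, bounded action and the critic scores state-action pairs; training would update the critic from the temporal-difference error and the actor by ascending the critic's action gradient, with the reward and termination functions shaping what counts as a safe path.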
format Article
id doaj-art-ad4a2a8b68ed47529c600b445a4af594
institution OA Journals
issn 2147-835X
language English
publishDate 2022-02-01
publisher Sakarya University
record_format Article
series Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
spelling doaj-art-ad4a2a8b68ed47529c600b445a4af594
2025-08-20T01:57:40Z
eng
Sakarya University
Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
2147-835X
2022-02-01
Vol. 26, No. 1, pp. 128-135
10.16984/saufenbilder.911942
Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
Mustafa Can Bingol
https://orcid.org/0000-0001-5448-8281
FIRAT ÜNİVERSİTESİ
https://dergipark.org.tr/tr/download/article-file/1693285
artificial neural networks
deep deterministic policy gradients
path planning
reinforcement learning
spellingShingle Mustafa Can Bingol
Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
artificial neural networks
deep deterministic policy gradients
path planning
reinforcement learning
title Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
title_full Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
title_fullStr Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
title_full_unstemmed Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
title_short Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot
title_sort reinforcement learning based safe path planning for a 3r planar robot
topic artificial neural networks
deep deterministic policy gradients
path planning
reinforcement learning
url https://dergipark.org.tr/tr/download/article-file/1693285
work_keys_str_mv AT mustafacanbingol reinforcementlearningbasedsafepathplanningfora3rplanarrobot