Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps

Deep reinforcement learning (DRL)-based navigation in an environment with dynamic obstacles is a challenging task due to the partially observable nature of the problem. While DRL algorithms structure the learning process around the Markov property (the assumption that all the information needed for a decision is contained in a single observation of the current state), the partial observability of the navigation problem is significantly amplified when dealing with dynamic obstacles. A single observation or measurement of the environment is often insufficient to capture the dynamic behavior of obstacles, which hinders the agent's decision-making. This study addresses this challenge by using an environment-specific heuristic approach that augments the observation with temporal information about dynamic obstacles to guide the agent's decision-making. We propose the Multichannel Cost Map Observation for Spatial and Temporal Information (M-COST) to mitigate these limitations. Our results show that the M-COST approach more than doubles the convergence rate in concentrated tunnel situations, where successful navigation is possible only if the agent learns to avoid dynamic obstacles. Additionally, navigation efficiency improved by 35% in tunnel scenarios and by 12% in dense-environment navigation compared with standard methods that rely on raw sensor data or frame stacking.
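
The record contains only the abstract, so the paper's exact observation layout is unknown; the following is a minimal Python sketch of the idea the abstract describes, encoding dynamic obstacles' temporal information as extra cost-map channels alongside a spatial channel. The function name, the constant-velocity projection, and the fading scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def build_mcost_observation(static_map, obstacle_tracks, horizon=3, dt=0.5):
    """Sketch of a multichannel cost-map observation (M-COST-style).

    Channel 0 holds the static occupancy grid; channels 1..horizon project
    each tracked dynamic obstacle forward under a constant-velocity
    assumption, so the policy sees where obstacles are heading rather than
    only where they are. The layout is illustrative, not the paper's exact
    formulation.
    """
    h, w = static_map.shape
    obs = np.zeros((horizon + 1, h, w), dtype=np.float32)
    obs[0] = static_map  # channel 0: walls and fixed obstacles

    for (r0, c0), (vr, vc) in obstacle_tracks:  # position/velocity in grid cells
        for k in range(1, horizon + 1):         # channel k: predicted state at t + k*dt
            r = int(round(r0 + vr * k * dt))
            c = int(round(c0 + vc * k * dt))
            if 0 <= r < h and 0 <= c < w:
                # fade the cost with lookahead depth to reflect growing uncertainty
                obs[k, r, c] = max(obs[k, r, c], 1.0 - 0.2 * (k - 1))
    return obs  # (horizon + 1, H, W) tensor consumed by the DRL policy network
```

A policy network would then consume this tensor as a (horizon + 1)-channel image, replacing the raw sensor history of frame stacking with an explicit trajectory encoding.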

Bibliographic Details
Main Authors: Tareq A. Fahmy, Omar M. Shehata, Shady A. Maged (all: Mechatronics Engineering Department, Ain Shams University, Cairo 11535, Egypt)
Format: Article
Language: English
Published: MDPI AG, 2024-11-01
Series: Robotics, vol. 13, no. 11, article 166
ISSN: 2218-6581
DOI: 10.3390/robotics13110166
Subjects: deep reinforcement learning; navigation; multichannel cost map; trajectory aware; spatial and temporal representation; POMDP (partially observable Markov decision process)
Online Access: https://www.mdpi.com/2218-6581/13/11/166