Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps
Deep reinforcement learning (DRL)-based navigation in an environment with dynamic obstacles is a challenging task due to the partially observable nature of the problem. While DRL algorithms are built around the Markov property (the assumption that all the information needed for a decision is contained in a single observation of the current state) to structure the learning process, the partial observability of the DRL navigation problem is significantly amplified when dealing with dynamic obstacles: a single observation or measurement of the environment is often insufficient for capturing the dynamic behavior of obstacles, thereby hindering the agent’s decision-making. This study addresses this challenge by using an environment-specific heuristic approach to augment the observation with the dynamic obstacles’ temporal information, guiding the agent’s decision-making. We propose the Multichannel Cost Map Observation for Spatial and Temporal Information (M-COST) to mitigate these limitations. Our results show that the M-COST approach more than doubles the convergence rate in concentrated tunnel situations, where successful navigation is only possible if the agent learns to avoid dynamic obstacles. Additionally, navigation efficiency improved by 35% in tunnel scenarios and by 12% in dense-environment navigation compared to standard methods that rely on raw sensor data or frame stacking.
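The record itself contains no code, but the abstract's central idea (replacing a single raw-sensor observation with a multichannel cost map whose extra channels encode where dynamic obstacles are headed) is concrete enough to sketch. The following is a minimal illustration, not the paper's implementation: the grid size, resolution, prediction horizon, time step, and the constant-velocity projection are all assumptions made here for clarity.

```python
# Minimal sketch (not the paper's code): a multichannel cost-map observation
# that encodes spatial layout plus temporal information about moving obstacles.
import numpy as np

GRID = 64      # cost map is GRID x GRID cells (assumed)
RES = 0.1      # meters per cell (assumed)
HORIZON = 3    # number of temporal channels for dynamic obstacles (assumed)
DT = 0.2       # seconds per prediction step (assumed)

def to_cell(xy):
    """Map a metric (x, y) position to grid indices, origin at map center."""
    ij = np.floor(np.asarray(xy, dtype=float) / RES).astype(int) + GRID // 2
    return np.clip(ij, 0, GRID - 1)

def m_cost_observation(static_obstacles, dynamic_obstacles):
    """Return a (1 + HORIZON, GRID, GRID) observation tensor.

    static_obstacles:  list of (x, y) positions of fixed obstacles.
    dynamic_obstacles: list of ((x, y), (vx, vy)) pairs; in practice the
                       velocities would come from an obstacle tracker.
    """
    obs = np.zeros((1 + HORIZON, GRID, GRID), dtype=np.float32)

    # Channel 0: spatial information (static occupancy).
    for pos in static_obstacles:
        i, j = to_cell(pos)
        obs[0, i, j] = 1.0

    # Channels 1..HORIZON: temporal information. Each channel marks where the
    # dynamic obstacles are predicted to be k steps ahead, using a simple
    # constant-velocity model (an assumption for illustration).
    for pos, vel in dynamic_obstacles:
        pos, vel = np.asarray(pos, float), np.asarray(vel, float)
        for k in range(1, HORIZON + 1):
            i, j = to_cell(pos + vel * k * DT)
            obs[k, i, j] = 1.0
    return obs

# Example: one wall cell and one obstacle crossing the robot's path.
obs = m_cost_observation([(1.0, 0.0)], [((0.0, -1.0), (0.0, 0.5))])
print(obs.shape)  # (4, 64, 64)
```

A frame-stacking baseline would instead stack several past raw observations and leave the agent to infer motion itself; the cost-map channels hand the network the same temporal information already spatially registered, which is the kind of representational shortcut the abstract credits for the faster convergence.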
Main Authors: | Tareq A. Fahmy, Omar M. Shehata, Shady A. Maged |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-11-01 |
Series: | Robotics |
Subjects: | deep reinforcement learning; navigation; multichannel cost map; trajectory aware; spatial and temporal representation; POMDP (partially observable Markov decision process) |
Online Access: | https://www.mdpi.com/2218-6581/13/11/166 |
_version_ | 1846152531514753024 |
---|---|
author | Tareq A. Fahmy; Omar M. Shehata; Shady A. Maged |
author_facet | Tareq A. Fahmy; Omar M. Shehata; Shady A. Maged |
author_sort | Tareq A. Fahmy |
collection | DOAJ |
description | Deep reinforcement learning (DRL)-based navigation in an environment with dynamic obstacles is a challenging task due to the partially observable nature of the problem. While DRL algorithms are built around the Markov property (the assumption that all the information needed for a decision is contained in a single observation of the current state) to structure the learning process, the partial observability of the DRL navigation problem is significantly amplified when dealing with dynamic obstacles: a single observation or measurement of the environment is often insufficient for capturing the dynamic behavior of obstacles, thereby hindering the agent’s decision-making. This study addresses this challenge by using an environment-specific heuristic approach to augment the observation with the dynamic obstacles’ temporal information, guiding the agent’s decision-making. We propose the Multichannel Cost Map Observation for Spatial and Temporal Information (M-COST) to mitigate these limitations. Our results show that the M-COST approach more than doubles the convergence rate in concentrated tunnel situations, where successful navigation is only possible if the agent learns to avoid dynamic obstacles. Additionally, navigation efficiency improved by 35% in tunnel scenarios and by 12% in dense-environment navigation compared to standard methods that rely on raw sensor data or frame stacking. |
format | Article |
id | doaj-art-e691b21c57d44112818e43e87abe41e8 |
institution | Kabale University |
issn | 2218-6581 |
language | English |
publishDate | 2024-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Robotics |
spelling | doaj-art-e691b21c57d44112818e43e87abe41e8; 2024-11-26T18:20:41Z; eng; MDPI AG; Robotics; 2218-6581; 2024-11-01; vol. 13, iss. 11, art. 166; 10.3390/robotics13110166; Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps; Tareq A. Fahmy, Omar M. Shehata, Shady A. Maged (all: Mechatronics Engineering Department, Ain Shams University, Cairo 11535, Egypt); abstract as given in the description field above; https://www.mdpi.com/2218-6581/13/11/166; deep reinforcement learning; navigation; multichannel cost map; trajectory aware; spatial and temporal representation; POMDP (partially observable Markov decision process) |
spellingShingle | Tareq A. Fahmy; Omar M. Shehata; Shady A. Maged; Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps; Robotics; deep reinforcement learning; navigation; multichannel cost map; trajectory aware; spatial and temporal representation; POMDP (partially observable Markov decision process) |
title | Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps |
title_full | Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps |
title_fullStr | Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps |
title_full_unstemmed | Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps |
title_short | Trajectory Aware Deep Reinforcement Learning Navigation Using Multichannel Cost Maps |
title_sort | trajectory aware deep reinforcement learning navigation using multichannel cost maps |
topic | deep reinforcement learning; navigation; multichannel cost map; trajectory aware; spatial and temporal representation; POMDP (partially observable Markov decision process) |
url | https://www.mdpi.com/2218-6581/13/11/166 |
work_keys_str_mv | AT tareqafahmy trajectoryawaredeepreinforcementlearningnavigationusingmultichannelcostmaps AT omarmshehata trajectoryawaredeepreinforcementlearningnavigationusingmultichannelcostmaps AT shadyamaged trajectoryawaredeepreinforcementlearningnavigationusingmultichannelcostmaps |