Memory-driven deep-reinforcement learning for autonomous robot navigation in partially observable environments

Bibliographic Details
Main Authors: Estrella Montero, Nabih Pico, Mitra Ghergherehchi, Ho Seung Song
Format: Article
Language: English
Published: Elsevier, 2025-02-01
Series: Engineering Science and Technology, an International Journal
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2215098624003288
Description
Summary: Service robots with autonomous navigational capabilities play a critical role in dynamic contexts where safe, collision-free human interaction is important. However, the unpredictable nature of human behavior, the prevalence of occlusions, and the lack of complete environmental perception due to sensor limitations can severely restrict effective robot navigation. We propose a memory-driven algorithm that employs deep reinforcement learning to enable collision-free proactive navigation in partially observable environments. The proposed method takes the relative states of humans within a limited field of view (FoV) and sensor range as input to the neural network. The model employs a bidirectional gated recurrent unit as a temporal function to strategically incorporate the previous context of input sequences and facilitate the assimilation of observations. This approach allows the model to assign greater attention to intricate human–robot relations, enabling a better understanding of the ever-changing dynamics within an environment. Simulations and experimental outcomes validate the efficacy of the policy-based navigation approach. It achieves superior collision-avoidance performance compared to representative existing methods and exhibits efficient navigation by incorporating the limitations of sensors during training.
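The core mechanism described in the abstract — encoding a history of partial observations with a bidirectional GRU before passing the result to the policy network — can be sketched as follows. This is a minimal illustrative NumPy implementation, not the authors' code; the observation dimensions, hidden size, and the interpretation of each observation as a relative human state are assumptions for the sake of the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(params, x, h):
    """One GRU step: update gate z, reset gate r, candidate state h_cand."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)
    r = sigmoid(Wr @ x + Ur @ h + br)
    h_cand = np.tanh(Wh @ x + Uh @ (r * h) + bh)
    return (1.0 - z) * h + z * h_cand

def init_gru(in_dim, hid_dim, rng):
    """Random small weights for the nine GRU parameter tensors."""
    s = 1.0 / np.sqrt(hid_dim)
    shapes = [(hid_dim, in_dim), (hid_dim, hid_dim), (hid_dim,)] * 3
    return tuple(rng.uniform(-s, s, shape) for shape in shapes)

def bigru_encode(obs_seq, fwd, bwd, hid_dim):
    """Run forward and backward GRU passes over the observation history
    and concatenate the two final hidden states into one feature vector."""
    h_f = np.zeros(hid_dim)
    for x in obs_seq:               # forward pass over time
        h_f = gru_cell(fwd, x, h_f)
    h_b = np.zeros(hid_dim)
    for x in reversed(obs_seq):     # backward pass over time
        h_b = gru_cell(bwd, x, h_b)
    return np.concatenate([h_f, h_b])

# Example: each observation is a hypothetical 5-D relative human state
# (e.g. relative position, velocity, radius) seen within the limited FoV.
rng = np.random.default_rng(0)
OBS_DIM, HID_DIM, SEQ_LEN = 5, 16, 8
fwd = init_gru(OBS_DIM, HID_DIM, rng)
bwd = init_gru(OBS_DIM, HID_DIM, rng)
history = [rng.standard_normal(OBS_DIM) for _ in range(SEQ_LEN)]
feature = bigru_encode(history, fwd, bwd, HID_DIM)
print(feature.shape)  # (32,) — this vector would feed the policy network
```

Because the GRU state is a convex combination of the previous state and a tanh-bounded candidate, the encoded feature stays in [-1, 1], which keeps the downstream policy input well-scaled regardless of how many observation steps are remembered.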
ISSN: 2215-0986