Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation

Bibliographic Details
Main Authors: Mahdi Shahbazi Khojasteh, Armin Salimi-Badr
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Open Journal of Vehicular Technology
Subjects: Deep reinforcement learning; experience replay; insight memory pool (IMP); path planning; Q-learning; autonomous aerial vehicle (AAV)
Online Access: https://ieeexplore.ieee.org/document/10758436/
_version_ 1850264709178064896
author Mahdi Shahbazi Khojasteh
Armin Salimi-Badr
author_facet Mahdi Shahbazi Khojasteh
Armin Salimi-Badr
author_sort Mahdi Shahbazi Khojasteh
collection DOAJ
description Autonomous navigation is a formidable challenge for autonomous aerial vehicles operating in dense or dynamic environments. This paper proposes a path-planning approach based on deep reinforcement learning for a quadrotor equipped with only a monocular camera. The proposed method employs a two-stage structure comprising a depth estimation module and a decision-making module. The former uses a convolutional encoder-decoder network to learn image depth from visual cues in a self-supervised manner, with its output serving as input to the latter. The decision-making module uses dueling double deep recurrent Q-learning to act in high-dimensional and partially observable state spaces. To reduce meaningless exploration, we introduce the Insight Memory Pool alongside the regular memory pool; sampling from the insight pool is emphasized early on to provide a rapid boost in learning, and the agent relies on its own experiences later. Once the agent has gained enough knowledge from the insightful data, we transition to a targeted exploration phase governed by a Boltzmann behavior policy, which relies on the refined Q-value estimates. To validate our approach, we tested the model in three diverse environments simulated with AirSim: a dynamic city street, a downtown, and a pillar world, each with different weather conditions. Experimental results show that our method significantly improves success rates and demonstrates strong generalization across various starting points and environmental transformations.
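For readers who want a concrete picture of the decision-making module described in the abstract, the sketch below illustrates in PyTorch how a dueling recurrent Q-network, a Boltzmann behavior policy, and a two-buffer replay scheme (a regular pool plus an insight pool) might be wired together. It is an illustrative reconstruction based only on the abstract, not the authors' code; every layer size, the sampling schedule, and the helper names (DuelingRecurrentQNet, boltzmann_action, DualReplay) are assumptions.

```python
# Illustrative sketch only: reconstructs the decision-making module described in the
# abstract (dueling double deep recurrent Q-learning, Boltzmann behavior policy, and a
# regular replay pool plus an "Insight Memory Pool"). Layer sizes, the sampling
# schedule, and all names are assumptions, not the authors' implementation.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingRecurrentQNet(nn.Module):
    """Conv encoder over estimated depth maps -> LSTM -> dueling value/advantage heads."""

    def __init__(self, n_actions: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(                  # input: 1-channel depth map, e.g. 84x84
            nn.Conv2d(1, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(64 * 7 * 7, hidden, batch_first=True)  # memory for partial observability
        self.value = nn.Linear(hidden, 1)              # state-value stream
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream

    def forward(self, depth_seq, hx=None):
        # depth_seq: (batch, time, 1, H, W) sequence of estimated depth maps
        b, t = depth_seq.shape[:2]
        feats = self.encoder(depth_seq.flatten(0, 1)).view(b, t, -1)
        out, hx = self.lstm(feats, hx)
        v, a = self.value(out), self.advantage(out)
        q = v + a - a.mean(dim=-1, keepdim=True)       # dueling aggregation of the two streams
        return q, hx


def boltzmann_action(q_values: torch.Tensor, temperature: float) -> int:
    """Sample an action from a softmax over Q-values (targeted exploration phase)."""
    probs = F.softmax(q_values / temperature, dim=-1)
    return torch.multinomial(probs, 1).item()


class DualReplay:
    """Regular pool plus an insight pool; sampling from the insight pool is emphasized
    early in training and shifts toward the agent's own experiences later."""

    def __init__(self, capacity: int = 50_000):
        self.regular = deque(maxlen=capacity)
        self.insight = deque(maxlen=capacity)

    def sample(self, batch_size: int, insight_prob: float):
        batch = []
        for _ in range(batch_size):
            use_insight = self.insight and (not self.regular or random.random() < insight_prob)
            pool = self.insight if use_insight else self.regular
            batch.append(random.choice(pool))
        return batch
```

In the double Q-learning update (not shown), the online network would select the next action while a separate target network evaluates it, and the insight_prob passed to DualReplay.sample would start near 1 and decay as training progresses, matching the transition from insight-driven learning to the Boltzmann-driven exploration phase described above.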
format Article
id doaj-art-994f66629a7444d5ac353e0824dea03f
institution OA Journals
issn 2644-1330
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Open Journal of Vehicular Technology
spelling doaj-art-994f66629a7444d5ac353e0824dea03f 2025-08-20T01:54:38Z, eng. Published by IEEE in IEEE Open Journal of Vehicular Technology, ISSN 2644-1330, 2025-01-01, Vol. 6, pp. 34-51, DOI 10.1109/OJVT.2024.3502296, IEEE document 10758436. Authors: Mahdi Shahbazi Khojasteh (https://orcid.org/0009-0007-6262-9460) and Armin Salimi-Badr (https://orcid.org/0000-0001-6613-7921), both with the Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran. Title: Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation. Abstract and keywords as given in the description and topic fields. Online access: https://ieeexplore.ieee.org/document/10758436/
spellingShingle Mahdi Shahbazi Khojasteh
Armin Salimi-Badr
Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
IEEE Open Journal of Vehicular Technology
Deep reinforcement learning
experience replay
insight memory pool (IMP)
path planning
Q-learning
autonomous aerial vehicle (AAV)
title Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
title_full Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
title_fullStr Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
title_full_unstemmed Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
title_short Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
title_sort autonomous quadrotor path planning through deep reinforcement learning with monocular depth estimation
topic Deep reinforcement learning
experience replay
insight memory pool (IMP)
path planning
Q-learning
autonomous aerial vehicle (AAV)
url https://ieeexplore.ieee.org/document/10758436/
work_keys_str_mv AT mahdishahbazikhojasteh autonomousquadrotorpathplanningthroughdeepreinforcementlearningwithmonoculardepthestimation
AT arminsalimibadr autonomousquadrotorpathplanningthroughdeepreinforcementlearningwithmonoculardepthestimation