Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation
Autonomous navigation is a formidable challenge for autonomous aerial vehicles operating in dense or dynamic environments. This paper proposes a path-planning approach based on deep reinforcement learning for a quadrotor equipped with only a monocular camera. The proposed method employs a two-stage structure comprising a depth-estimation module and a decision-making module. The former uses a convolutional encoder-decoder network to learn image depth from visual cues in a self-supervised manner, with its output serving as input to the latter. The decision-making module uses dueling double deep recurrent Q-learning to make decisions in high-dimensional and partially observable state spaces. To reduce meaningless exploration, we introduce the Insight Memory Pool alongside the regular memory pool; emphasizing early sampling from it provides a rapid boost in learning, while the agent relies on its own experiences later. Once the agent has gained enough knowledge from the insightful data, we transition to a targeted exploration phase by employing the Boltzmann behavior policy, which relies on the refined Q-value estimates. To validate our approach, we tested the model in three diverse environments simulated with AirSim: a dynamic city street, a downtown, and a pillar world, each under different weather conditions. Experimental results show that our method significantly improves success rates and demonstrates strong generalization across various starting points and environmental transformations.
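The decision-making module the abstract describes, dueling double deep recurrent Q-learning over estimated depth maps, can be sketched as follows. This is a minimal illustration, not the authors' published code: the layer sizes, the 84x84 single-channel depth input, the action count `n_actions`, and the discount `gamma` are all assumptions made for the example.

```python
# Sketch of a dueling double deep recurrent Q-network (hypothetical sizes).
import torch
import torch.nn as nn

class DuelingDRQN(nn.Module):
    def __init__(self, n_actions: int = 5, hidden: int = 256):
        super().__init__()
        # Convolutional encoder over single-channel estimated depth images (84x84 assumed).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # Recurrent layer carries state across time steps (partial observability).
        self.lstm = nn.LSTM(64 * 7 * 7, hidden, batch_first=True)
        # Dueling heads: state value V(s) and per-action advantages A(s, a).
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, depth_seq, hx=None):
        # depth_seq: (batch, time, 1, 84, 84)
        b, t = depth_seq.shape[:2]
        feats = self.conv(depth_seq.flatten(0, 1)).view(b, t, -1)
        out, hx = self.lstm(feats, hx)
        v, a = self.value(out), self.advantage(out)
        # Subtracting the mean advantage keeps the Q decomposition identifiable.
        q = v + a - a.mean(dim=-1, keepdim=True)
        return q, hx

def double_q_target(online, target, next_obs, reward, done, gamma=0.99):
    # Double DQN: the online net selects argmax actions, the target net evaluates them.
    with torch.no_grad():
        next_a = online(next_obs)[0].argmax(dim=-1, keepdim=True)
        next_q = target(next_obs)[0].gather(-1, next_a).squeeze(-1)
        return reward + gamma * (1.0 - done) * next_q
```

The dueling split separates how good a state is from how much each action matters in it, while the double-Q target curbs the overestimation bias of plain Q-learning; both are standard pieces the paper's title names.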
| Main Authors: | Mahdi Shahbazi Khojasteh, Armin Salimi-Badr |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Open Journal of Vehicular Technology |
| Subjects: | Deep reinforcement learning; experience replay; insight memory pool (IMP); path planning; Q-learning; autonomous aerial vehicle (AAV) |
| Online Access: | https://ieeexplore.ieee.org/document/10758436/ |
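The Insight Memory Pool and Boltzmann behavior policy described in the abstract lend themselves to a compact illustration. The sketch below is a rough reconstruction rather than the authors' implementation: the linearly annealed insight-sampling ratio (`anneal_steps`), the fixed softmax temperature, and plain uniform sampling within each pool are all assumptions, not specifics from the paper.

```python
# Hedged sketch: dual replay memory with an Insight Memory Pool (IMP)
# plus a Boltzmann (softmax) behavior policy over Q-value estimates.
import random
import numpy as np

class DualReplay:
    """Regular experience pool plus an IMP of curated, insightful transitions."""

    def __init__(self, capacity=100_000):
        self.regular, self.insight = [], []
        self.capacity = capacity

    def add(self, transition, insightful=False):
        pool = self.insight if insightful else self.regular
        pool.append(transition)
        del pool[:-self.capacity]  # drop the oldest transitions beyond capacity

    def sample(self, batch_size, step, anneal_steps=50_000):
        # Early on, sample mostly from the IMP for a rapid learning boost;
        # later, rely on the agent's own experiences (assumed linear schedule).
        imp_ratio = max(0.0, 1.0 - step / anneal_steps)
        n_imp = min(int(batch_size * imp_ratio), len(self.insight))
        batch = random.sample(self.insight, n_imp)
        batch += random.sample(self.regular,
                               min(batch_size - n_imp, len(self.regular)))
        return batch

def boltzmann_action(q_values, temperature=0.5):
    # Softmax over refined Q-value estimates: targeted exploration that still
    # prefers high-value actions, unlike undirected epsilon-greedy noise.
    z = (q_values - q_values.max()) / temperature
    probs = np.exp(z) / np.exp(z).sum()
    return np.random.choice(len(q_values), p=probs)
```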
| Field | Value |
|---|---|
| author | Mahdi Shahbazi Khojasteh; Armin Salimi-Badr |
| collection | DOAJ |
| format | Article |
| id | doaj-art-994f66629a7444d5ac353e0824dea03f |
| institution | OA Journals |
| issn | 2644-1330 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Open Journal of Vehicular Technology |
| doi | 10.1109/OJVT.2024.3502296 |
| volume | 6 |
| pages | 34-51 |
| author_orcid | Mahdi Shahbazi Khojasteh: https://orcid.org/0009-0007-6262-9460; Armin Salimi-Badr: https://orcid.org/0000-0001-6613-7921 |
| affiliation | Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran (both authors) |
| title | Autonomous Quadrotor Path Planning Through Deep Reinforcement Learning With Monocular Depth Estimation |
| topic | Deep reinforcement learning; experience replay; insight memory pool (IMP); path planning; Q-learning; autonomous aerial vehicle (AAV) |
| url | https://ieeexplore.ieee.org/document/10758436/ |