Deep Reinforcement Learning Assisted UAV Path Planning Relying on Cumulative Reward Mode and Region Segmentation

In recent years, unmanned aerial vehicles (UAVs) have been considered for many applications, such as disaster prevention and control, logistics and transportation, and wireless communication. Most UAVs need to be manually controlled using remote control, which can be challenging in many environments...

Bibliographic Details
Main Authors:	Zhipeng Wang, Soon Xin Ng, Mohammed El-Hajjar
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Open Journal of Vehicular Technology
Subjects:	Autonomous navigation; cumulative reward model; deep reinforcement learning; experience replay; region segmentation; UAV path planning
Online Access:	https://ieeexplore.ieee.org/document/10531630/

_version_	1850233697767260160
author	Zhipeng Wang, Soon Xin Ng, Mohammed El-Hajjar
author_facet	Zhipeng Wang, Soon Xin Ng, Mohammed El-Hajjar
author_sort	Zhipeng Wang
collection	DOAJ
description	In recent years, unmanned aerial vehicles (UAVs) have been considered for many applications, such as disaster prevention and control, logistics and transportation, and wireless communication. Most UAVs must be controlled manually by remote control, which can be challenging in many environments. Autonomous UAVs have therefore attracted significant research interest, but most existing autonomous navigation algorithms suffer from long computation times and unsatisfactory performance. Hence, we propose a Deep Reinforcement Learning (DRL) UAV path planning algorithm based on cumulative reward and region segmentation. Our proposed region segmentation aims to reduce the probability of DRL agents falling into local optimal traps, while our proposed cumulative reward model takes into account the distance from the node to the destination and the density of obstacles near the node, which solves the problem of sparse training data faced by DRL algorithms in the path planning task. The proposed region segmentation algorithm and cumulative reward model have been tested with different DRL techniques, where we show that the cumulative reward model can improve the training efficiency of deep neural networks by 30.8%, and that the region segmentation algorithm enables a deep Q-network agent to avoid 99% of local optimal traps and assists a deep deterministic policy gradient agent in avoiding 92% of local optimal traps.
format	Article
id	doaj-art-e243e7fd9097432f965e9dab4c17b7de
institution	OA Journals
issn	2644-1330
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Open Journal of Vehicular Technology
spelling	doaj-art-e243e7fd9097432f965e9dab4c17b7de; 2025-08-20T02:02:51Z; eng; IEEE; IEEE Open Journal of Vehicular Technology; ISSN 2644-1330; 2024-01-01; vol. 5, pp. 737-751; doi:10.1109/OJVT.2024.3402129; IEEE document 10531630; Deep Reinforcement Learning Assisted UAV Path Planning Relying on Cumulative Reward Mode and Region Segmentation; Zhipeng Wang (https://orcid.org/0009-0004-1940-1047), Soon Xin Ng (https://orcid.org/0000-0002-0930-7194), Mohammed El-Hajjar (https://orcid.org/0000-0002-7987-1401); School of Electronics and Computer Science, University of Southampton, Southampton, U.K.; https://ieeexplore.ieee.org/document/10531630/
title	Deep Reinforcement Learning Assisted UAV Path Planning Relying on Cumulative Reward Mode and Region Segmentation
topic	Autonomous navigation; cumulative reward model; deep reinforcement learning; experience replay; region segmentation; UAV path planning
url	https://ieeexplore.ieee.org/document/10531630/

Similar Items