Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning

Reinforcement learning (RL) has been shown to be effective in path planning. However, it usually requires exploring a sufficient number of state–action pairs, some of which may be unsafe when deployed in practical obstacle environments. To this end, this paper proposes an end-to-end planning method...


Bibliographic Details
Main Authors: Hong Chen, Dan Huang, Chenggang Wang, Lu Ding, Lei Song, Hongtao Liu
Format: Article
Language: English
Published: MDPI AG, 2024-09-01
Series: Drones
Subjects: reinforcement learning; control barrier function; multiple agents
Online Access: https://www.mdpi.com/2504-446X/8/9/481
_version_ 1850261171823706112
author Hong Chen
Dan Huang
Chenggang Wang
Lu Ding
Lei Song
Hongtao Liu
author_facet Hong Chen
Dan Huang
Chenggang Wang
Lu Ding
Lei Song
Hongtao Liu
author_sort Hong Chen
collection DOAJ
description Reinforcement learning (RL) has been shown to be effective in path planning. However, it usually requires exploring a sufficient number of state–action pairs, some of which may be unsafe when deployed in practical obstacle environments. To this end, this paper proposes an end-to-end planning method based on a model-free RL framework with optimization, which achieves better learning performance with a safety guarantee. Firstly, for second-order drone systems, a differentiable high-order control barrier function (HOCBF) is introduced to ensure that the output of the planning algorithm falls within a safe range. Then, a safety layer based on the HOCBF is proposed, which projects RL actions onto a feasible solution set to guarantee safe exploration. Finally, the proposed method is validated in a simulated drone obstacle-avoidance environment. The experimental results demonstrate a significant improvement over the baseline approach: the proposed method substantially reduces the average cumulative number of collisions per drone during training and, in the testing phase, achieves a 43% higher task success rate than the MADDPG baseline.
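The safety layer described in the abstract admits a compact quadratic-program (QP) reading: at each control step, the action proposed by the RL policy is minimally modified so that the high-order CBF condition holds for every obstacle. The following is a minimal sketch of such a filter for a planar double-integrator drone, not the authors' implementation; the linear class-K gains alpha1 and alpha2, the safety radius, the use of cvxpy for the QP, and the function name hocbf_safety_layer are illustrative assumptions.

# Minimal sketch (assumed, not the paper's code) of an HOCBF safety layer for a
# planar double-integrator drone: the RL policy proposes an acceleration u_rl,
# and a small QP returns the closest action satisfying the second-order CBF
# condition for each obstacle. Requires numpy and cvxpy.
import numpy as np
import cvxpy as cp

def hocbf_safety_layer(p, v, u_rl, obstacles, r_safe=0.5, alpha1=2.0, alpha2=2.0):
    """Project the RL acceleration u_rl onto the HOCBF-feasible set.

    p, v, u_rl : np.ndarray, shape (2,)  -- drone position, velocity, RL action
    obstacles  : list of np.ndarray (2,) -- obstacle centres
    """
    u = cp.Variable(2)
    constraints = []
    for p_o in obstacles:
        d = p - p_o
        h = d @ d - r_safe ** 2          # h(x) >= 0 defines the safe set
        h_dot = 2.0 * d @ v              # first time derivative of h
        # Second-order HOCBF condition with linear class-K functions:
        #   h_ddot + (a1 + a2) * h_dot + a1 * a2 * h >= 0,
        # where h_ddot = 2 ||v||^2 + 2 d^T u for the double integrator.
        constraints.append(
            2.0 * d @ u + 2.0 * v @ v
            + (alpha1 + alpha2) * h_dot + alpha1 * alpha2 * h >= 0
        )
    # Minimal intervention: stay as close as possible to the RL action.
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_rl)), constraints)
    prob.solve()
    return u.value if u.value is not None else u_rl  # crude fallback if infeasible

# Example: the RL action accelerates straight toward an obstacle, and the
# safety layer deflects it while staying close to the requested acceleration.
u_safe = hocbf_safety_layer(p=np.array([0.0, 0.0]), v=np.array([1.0, 0.0]),
                            u_rl=np.array([1.0, 0.0]),
                            obstacles=[np.array([2.0, 0.1])])
print(u_safe)

Returning the raw RL action when the QP reports no solution is only a placeholder here; a deployed safety layer would need a principled fallback such as a braking action.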
format Article
id doaj-art-db990fb209bf414c8588491f33312aad
institution OA Journals
issn 2504-446X
language English
publishDate 2024-09-01
publisher MDPI AG
record_format Article
series Drones
spelling Drones, vol. 8, no. 9, art. 481, 2024-09-01. MDPI AG. ISSN 2504-446X. DOI: 10.3390/drones8090481. Record doaj-art-db990fb209bf414c8588491f33312aad, indexed 2025-08-20T01:55:30Z, language eng.
Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
Hong Chen (School of Electrical Engineering, Guangxi University, Nanning 530004, China)
Dan Huang (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Chenggang Wang (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Lu Ding (School of Electrical Engineering, Guangxi University, Nanning 530004, China)
Lei Song (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Hongtao Liu (92281 Branch, Zhucheng 262200, China)
Abstract: as given in the description field above.
URL: https://www.mdpi.com/2504-446X/8/9/481
Keywords: reinforcement learning; control barrier function; multiple agents
spellingShingle Hong Chen
Dan Huang
Chenggang Wang
Lu Ding
Lei Song
Hongtao Liu
Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
Drones
reinforcement learning
control barrier function
multiple agents
title Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
title_full Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
title_fullStr Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
title_full_unstemmed Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
title_short Collision-Free Path Planning for Multiple Drones Based on Safe Reinforcement Learning
title_sort collision free path planning for multiple drones based on safe reinforcement learning
topic reinforcement learning
control barrier function
multiple agents
url https://www.mdpi.com/2504-446X/8/9/481
work_keys_str_mv AT hongchen collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning
AT danhuang collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning
AT chenggangwang collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning
AT luding collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning
AT leisong collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning
AT hongtaoliu collisionfreepathplanningformultipledronesbasedonsafereinforcementlearning