Comparative Evaluation of Reinforcement Learning Algorithms for Multi-Agent Unmanned Aerial Vehicle Path Planning in 2D and 3D Environments

Path planning in multi-agent UAV swarms is a crucial issue that involves avoiding collisions in dynamic, obstacle-filled environments while consuming the least amount of time and energy possible. This work comprehensively evaluates reinforcement learning (RL) algorithms for multi-agent UAV path plan...

Bibliographic Details
Main Authors: Mirza Aqib Ali, Adnan Maqsood, Usama Athar, Hasan Raza Khanzada
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Drones
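Subjects: multi-agent systems; UAV swarms; path planning; reinforcement learning; autonomous systems; UAV navigation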
Online Access: https://www.mdpi.com/2504-446X/9/6/438
_version_ 1849472312453103616
author Mirza Aqib Ali
Adnan Maqsood
Usama Athar
Hasan Raza Khanzada
collection DOAJ
description Path planning in multi-agent UAV swarms is a crucial issue that involves avoiding collisions in dynamic, obstacle-filled environments while consuming the least amount of time and energy possible. This work comprehensively evaluates reinforcement learning (RL) algorithms for multi-agent UAV path planning in 2D and 3D simulated environments. First, we develop a 2D simulation setup using Python in which UAVs (quadcopters), represented as points in space, navigate toward their respective targets while avoiding static obstacles and inter-agent collisions. In the second phase, we transition this comparison to a physics-based 3D simulation, incorporating realistic UAV (fixed-wing) dynamics and checkpoint-based navigation. We compare five algorithms, namely, Proximal Policy Optimization (PPO), Soft Actor–Critic (SAC), Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Multi-Agent DDPG (MADDPG), in various scenarios. Our findings reveal significant performance differences between the algorithms across multiple dimensions. DDPG consistently demonstrated superior reward optimization and collision avoidance performance, while PPO and MADDPG excelled in the execution time required to reach the goal. Furthermore, our findings reveal how algorithms perform while transitioning from a simplistic 2D setup to a realistic 3D physics-based environment, which is essential for performing sim-to-real transfer. This work provides valuable insights into the suitability of several RL algorithms for developing autonomous systems and UAV swarm navigation.
format Article
id doaj-art-80da8a57320c4eea8390b18f47eb8129
institution Kabale University
issn 2504-446X
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Drones
spelling doaj-art-80da8a57320c4eea8390b18f47eb8129
2025-08-20T03:24:34Z
eng
MDPI AG
Drones
2504-446X
2025-06-01
Volume 9, Issue 6, Article 438
DOI: 10.3390/drones9060438
Comparative Evaluation of Reinforcement Learning Algorithms for Multi-Agent Unmanned Aerial Vehicle Path Planning in 2D and 3D Environments
Mirza Aqib Ali (0), Adnan Maqsood (1), Usama Athar (2), Hasan Raza Khanzada (3)
Affiliation (0-3): School of Interdisciplinary Engineering and Sciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
https://www.mdpi.com/2504-446X/9/6/438
multi-agent systems; UAV swarms; path planning; reinforcement learning; autonomous systems; UAV navigation
title Comparative Evaluation of Reinforcement Learning Algorithms for Multi-Agent Unmanned Aerial Vehicle Path Planning in 2D and 3D Environments
topic multi-agent systems
UAV swarms
path planning
reinforcement learning
autonomous systems
UAV navigation
url https://www.mdpi.com/2504-446X/9/6/438