Exploring the possibilities of MADDPG for UAV swarm control by simulating in Pac-Man environment

Bibliographic Details
Main Authors: Artem Novikov, Sergiy Yakovlev, Ivan Gushchin
Format: Article
Language: English
Published: National Aerospace University «Kharkiv Aviation Institute» 2025-02-01
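Series: Радіоелектронні і комп'ютерні системи (Radioelectronic and Computer Systems)
Subjects: multi-agent reinforcement learning; navigation; adversarial UAV strategies; computer modelling
Online Access: http://nti.khai.edu/ojs/index.php/reks/article/view/2789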
collection DOAJ
description This paper explores the use of Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to train models that control UAV swarms in dynamic and adversarial scenarios. In a modified Pac-Man environment, Pac-Man represents a target UAV and the Ghosts represent the UAV swarm that counteracts it. The grid-based Pac-Man maze serves as an abstraction of a two-dimensional terrain model: a plane of pathways and obstacles that corresponds to UAV flight conditions at a certain altitude. This representation gives a clear discretization of space, which simplifies pathfinding, collision avoidance, and the planning of reconnaissance or interception routes. Combining decentralized local autonomy with centralized training enables the UAVs to coordinate effectively and adapt quickly to changing conditions. The study evaluates adversaries controlled by MADDPG-trained models against heuristic navigation strategies such as A* and Breadth-First Search (BFS). Traditional rule-based pursuit and prediction algorithms, inspired by the behavior of the Blinky and Pinky ghosts from the classic Pac-Man game, are included as benchmarks to assess the impact of learning-based methods. The purpose of the study is to determine how effectively MADDPG-trained models enhance UAV swarm control, by analyzing their adaptability and coordination capabilities in adversarial environments through computer modeling in simplified, mission-like 2D environments. Experiments across varying levels of terrain complexity showed that the MADDPG-trained model demonstrated superior adaptability and strategic coordination compared with the rule-based methods. Ghosts controlled by a MADDPG-trained model significantly reduce the success rate of Pac-Man agents, particularly in highly constrained environments, which underlines the potential of learning-based adversarial strategies in UAV applications such as urban navigation, defense, and surveillance.
Conclusions. MADDPG is a promising and robust framework for training models to control UAV swarms, particularly in adversarial settings. The study highlights its adaptability and its ability to outperform traditional rule-based methods in dynamic and complex environments. Future research should compare the effectiveness of MADDPG-trained models with multi-agent algorithms such as Expectimax, Alpha-Beta Pruning, and Monte Carlo Tree Search (MCTS), to better understand the advantages and limitations of learning-based approaches relative to traditional decision-making methods in collaborative and adversarial UAV operations. In addition, exploring 3D implementations that integrate maze height decomposition and flight restrictions, and incorporating cybersecurity considerations and real-world threats such as anti-drone systems and electronic warfare, would enhance the robustness and applicability of these methods in realistic UAV scenarios.
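The rule-based baselines named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the maze layout, the function names, and the four-cell lookahead for the Pinky-style predictor are assumptions chosen for illustration. It shows BFS pathfinding on a grid maze, with a Blinky-style pursuer that targets Pac-Man's current cell and a Pinky-style predictor that targets a cell ahead of Pac-Man.

```python
from collections import deque

# Hypothetical 2D maze: '#' is an obstacle, '.' is free airspace at one altitude.
MAZE = [
    "#######",
    "#.....#",
    "#.##..#",
    "#.....#",
    "#######",
]

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right as (row, col) steps


def is_free(cell):
    """True if the cell lies inside the maze and is not an obstacle."""
    r, c = cell
    return 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#"


def bfs_path(start, goal):
    """Return a shortest list of cells from start to goal, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk parent links back to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        for dr, dc in MOVES:
            nxt = (cell[0] + dr, cell[1] + dc)
            if is_free(nxt) and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None


def blinky_target(pacman, heading):
    """Direct pursuit: aim at the target's current cell (heading is ignored)."""
    return pacman


def pinky_target(pacman, heading, lookahead=4):
    """Prediction: aim up to `lookahead` cells ahead of the target, stopping at obstacles."""
    cell = pacman
    for _ in range(lookahead):
        nxt = (cell[0] + heading[0], cell[1] + heading[1])
        if not is_free(nxt):
            break
        cell = nxt
    return cell


def next_step(ghost, target):
    """First move along the BFS path to target; stay put if there is no path or no move needed."""
    path = bfs_path(ghost, target)
    return path[1] if path and len(path) > 1 else ghost
```

A pursuer would call `next_step(ghost, blinky_target(pacman, heading))` once per tick. A learned MADDPG policy would replace the hand-written target rule with an actor network per ghost, which this sketch does not attempt to model.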
format Article
id doaj-art-6c828b33328d4aeb8a57652f9dd79856
institution OA Journals
issn 1814-4225
2663-2012
spelling doaj-art-6c828b33328d4aeb8a57652f9dd79856
Record timestamp: 2025-08-20T02:26:04Z
Language: eng
Publisher: National Aerospace University «Kharkiv Aviation Institute»
Series: Радіоелектронні і комп'ютерні системи
ISSN: 1814-4225, 2663-2012
Published: 2025-02-01
Volume 2025, no. 1, pp. 327-337
DOI (fused with a trailing identifier in the source record): 10.32620/reks.2025.1.212457
Title: Exploring the possibilities of MADDPG for UAV swarm control by simulating in Pac-Man environment
Artem Novikov: Institute of Computer Science and Artificial Intelligence at V. N. Karazin Kharkiv National University, Kharkiv
Sergiy Yakovlev: Institute of Computer Science and Artificial Intelligence at V. N. Karazin Kharkiv National University, Kharkiv, Ukraine; Institute of Mathematics, Lodz University of Technology, Lodz
Ivan Gushchin: Softatlas, Kharkiv
Online access: http://nti.khai.edu/ojs/index.php/reks/article/view/2789
Keywords: multi-agent reinforcement learning; navigation; adversarial UAV strategies; computer modelling
topic multi-agent reinforcement learning
navigation
adversarial UAV strategies
computer modelling