DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments
Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-Based Proximal Policy Optimization (<i>DTPP...
Main Authors: | Anning Wei, Jintao Liang, Kaiyuan Lin, Ziyue Li, Rui Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-11-01 |
Series: | Drones |
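Subjects: | multi-UAV navigation; partially observable Markov decision process; multi-agent deep reinforcement learning; cross-scenario transferability |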
Online Access: | https://www.mdpi.com/2504-446X/8/12/720 |
_version_ | 1846104992883146752 |
---|---|
author | Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao |
author_facet | Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao |
author_sort | Anning Wei |
collection | DOAJ |
description | Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-Based Proximal Policy Optimization (DTPPO) method. DTPPO enhances multi-UAV collaboration through a Spatial Transformer, which models inter-agent dynamics, and a Temporal Transformer, which captures temporal dependencies to improve generalization across diverse environments. This architecture allows UAVs to navigate new, unseen environments without retraining. Extensive simulations demonstrate that DTPPO outperforms current MADRL methods in terms of transferability, obstacle avoidance, and navigation efficiency across environments with varying obstacle densities. The results confirm DTPPO’s effectiveness as a robust solution for multi-UAV navigation in both known and unseen scenarios. |
format | Article |
id | doaj-art-5e3fa4bf906b4122883aec9725527edb |
institution | Kabale University |
issn | 2504-446X |
language | English |
publishDate | 2024-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Drones |
spelling | doaj-art-5e3fa4bf906b4122883aec9725527edb; 2024-12-27T14:21:46Z; eng; MDPI AG; Drones; ISSN 2504-446X; 2024-11-01; volume 8, issue 12, article 720; DOI 10.3390/drones8120720; DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments; Anning Wei (0), Jintao Liang (1), Kaiyuan Lin (2), Ziyue Li (3), Rui Zhao (4); (0) Department of Automation, Tsinghua University, Beijing 100190, China; (1) State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; (2) Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90007, USA; (3) Department of Information Systems, University of Cologne, 50923 Köln, Germany; (4) SenseTime Research; https://www.mdpi.com/2504-446X/8/12/720; multi-UAV navigation; partially observable Markov decision process; multi-agent deep reinforcement learning; cross-scenario transferability |
spellingShingle | Anning Wei; Jintao Liang; Kaiyuan Lin; Ziyue Li; Rui Zhao; DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments; Drones; multi-UAV navigation; partially observable Markov decision process; multi-agent deep reinforcement learning; cross-scenario transferability |
title | DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments |
title_full | DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments |
title_fullStr | DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments |
title_full_unstemmed | DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments |
title_short | DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments |
title_sort | dtppo dual transformer encoder based proximal policy optimization for multi uav navigation in unseen complex environments |
topic | multi-UAV navigation; partially observable Markov decision process; multi-agent deep reinforcement learning; cross-scenario transferability |
url | https://www.mdpi.com/2504-446X/8/12/720 |
work_keys_str_mv | AT anningwei dtppodualtransformerencoderbasedproximalpolicyoptimizationformultiuavnavigationinunseencomplexenvironments AT jintaoliang dtppodualtransformerencoderbasedproximalpolicyoptimizationformultiuavnavigationinunseencomplexenvironments AT kaiyuanlin dtppodualtransformerencoderbasedproximalpolicyoptimizationformultiuavnavigationinunseencomplexenvironments AT ziyueli dtppodualtransformerencoderbasedproximalpolicyoptimizationformultiuavnavigationinunseencomplexenvironments AT ruizhao dtppodualtransformerencoderbasedproximalpolicyoptimizationformultiuavnavigationinunseencomplexenvironments |