Multi-Objective Dynamic Path Planning with Multi-Agent Deep Reinforcement Learning

Multi-agent reinforcement learning (MARL) is characterized by its simple structure and strong adaptability, which has led to its widespread application in the field of path planning. To address the challenge of optimal path planning for mobile agent clusters in uncertain environments, a multi-object...

Bibliographic Details
Main Authors: Mengxue Tao, Qiang Li, Junxi Yu
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Journal of Marine Science and Engineering
Subjects: multi-agent reinforcement learning; multi-objective dynamic path planning; swarm intelligence
Online Access: https://www.mdpi.com/2077-1312/13/1/20
author Mengxue Tao
Qiang Li
Junxi Yu
collection DOAJ
description Multi-agent reinforcement learning (MARL) has a simple structure and strong adaptability, which has led to its widespread use in path planning. To address optimal path planning for clusters of mobile agents in uncertain environments, a multi-objective dynamic path planning (MODPP) model based on multi-agent deep reinforcement learning (MADRL) is proposed. The model is scalable and suits complex, unstable task environments that suffer from dimensionality explosion. It consists of two components, an action evaluation module and an action decision module, and uses a centralized training with decentralized execution (CTDE) architecture. During training, agents within the cluster can communicate with one another and learn cooperative strategies. At execution time they can therefore navigate task environments without communication, following collision-free paths that globally optimize multiple sub-objectives: minimizing time, distance, and the overall cost of turning. In real-task execution, agents acting as mobile entities can also avoid obstacles in real time. Environments including a simple multi-objective environment and a complex multi-objective environment were built on the OpenAI Gym platform to analyze the rationality and effectiveness of the multi-objective dynamic path planning through minimum-cost and collision-risk assessments. The impact of reward function configuration on agent strategies is also discussed.
format Article
id doaj-art-c8f6bece0d71414db89b9c0175765bb2
institution Kabale University
issn 2077-1312
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Journal of Marine Science and Engineering
spelling doaj-art-c8f6bece0d71414db89b9c0175765bb2 (record updated 2025-01-24T13:36:34Z). Mengxue Tao, Qiang Li, Junxi Yu (all: Navigation College, Dalian Maritime University, Dalian 116026, China). Multi-Objective Dynamic Path Planning with Multi-Agent Deep Reinforcement Learning. Journal of Marine Science and Engineering, MDPI AG, ISSN 2077-1312, vol. 13, no. 1, article 20, 2024-12-01. Language: eng. DOI: 10.3390/jmse13010020. https://www.mdpi.com/2077-1312/13/1/20
title Multi-Objective Dynamic Path Planning with Multi-Agent Deep Reinforcement Learning
topic multi-agent reinforcement learning
multi-objective dynamic path planning
swarm intelligence
url https://www.mdpi.com/2077-1312/13/1/20