Deep Reinforcement Learning-Based Multi-Agent System with Advanced Actor–Critic Framework for Complex Environment

The development of artificial intelligence (AI) game agents that use deep reinforcement learning (DRL) algorithms to process visual information for decision-making has emerged as a key research focus in both academia and industry. However, previous game agents have struggled to execute multiple comm...

Bibliographic Details
Main Authors: Zihao Cui, Kailian Deng, Hongtao Zhang, Zhongyi Zha, Sayed Jobaer
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Mathematics
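Subjects: deep reinforcement learning; convolution neural network; partially observable Markov decision process; multi-agent system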
Online Access: https://www.mdpi.com/2227-7390/13/5/754
author Zihao Cui
Kailian Deng
Hongtao Zhang
Zhongyi Zha
Sayed Jobaer
collection DOAJ
description The development of artificial intelligence (AI) game agents that use deep reinforcement learning (DRL) algorithms to process visual information for decision-making has emerged as a key research focus in both academia and industry. However, previous game agents have struggled to execute multiple commands simultaneously in a single decision, failing to accurately replicate the complex control patterns that characterize human gameplay. In this paper, we utilize the ViZDoom environment as the DRL research platform and transform the agent–environment interactions into a Partially Observable Markov Decision Process (POMDP). We introduce an advanced multi-agent deep reinforcement learning (DRL) framework, specifically a Multi-Agent Proximal Policy Optimization (MA-PPO), designed to optimize target acquisition while operating within defined ammunition and time constraints. In MA-PPO, each agent handles distinct parallel tasks with custom reward functions for performance evaluation. The agents make independent decisions while simultaneously executing multiple commands to mimic human-like gameplay behavior. Our evaluation compares MA-PPO against other DRL algorithms, showing a 30.67% performance improvement over the baseline algorithm.
format Article
id doaj-art-372b7597abab459f9b787b43511ab9e2
institution OA Journals
issn 2227-7390
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-372b7597abab459f9b787b43511ab9e2, 2025-08-20T02:04:36Z, eng
Zihao Cui, Kailian Deng, Hongtao Zhang, Zhongyi Zha, Sayed Jobaer. Deep Reinforcement Learning-Based Multi-Agent System with Advanced Actor–Critic Framework for Complex Environment. Mathematics (MDPI AG, ISSN 2227-7390), 2025-02-01, vol. 13, no. 5, article 754. DOI: 10.3390/math13050754. https://www.mdpi.com/2227-7390/13/5/754
Affiliation (all five authors): College of Information Science and Technology, Donghua University, Shanghai 201620, China
title Deep Reinforcement Learning-Based Multi-Agent System with Advanced Actor–Critic Framework for Complex Environment
topic deep reinforcement learning
convolution neural network
partially observable Markov decision process
multi-agent system
url https://www.mdpi.com/2227-7390/13/5/754