Assisted-Value Factorization with Latent Interaction in Cooperate Multi-Agent Reinforcement Learning

With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents are continuously changing. The current value decompositio...

Bibliographic Details
Main Authors:	Zhitong Zhao, Ya Zhang, Siying Wang, Yang Zhou, Ruoning Zhang, Wenyu Chen
Format:	Article
Language:	English
Published:	MDPI AG 2025-04-01
Series:	Mathematics
Subjects:	reinforcement learning, decentralized partially observable Markov decision process (Dec-POMDP), multi-agent reinforcement learning, multi-agent value decomposition
Online Access:	https://www.mdpi.com/2227-7390/13/9/1429

collection	DOAJ
description	With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents are continuously changing. The current value decomposition methods struggle to adeptly handle these dynamic changes, thereby impairing the effectiveness of cooperative policies. In this paper, we introduce the concept of latent interaction, upon which an innovative method for generating weights is developed. The proposed method derives weights from the history information, thereby enhancing the accuracy of value estimations. Building upon this, we further propose a dynamic masking mechanism that recalibrates history information in response to the activity level of agents, improving the precision of latent interaction assessments. Experimental results demonstrate the improved training speed and superior performance of the proposed method in both a multi-agent particle environment and the StarCraft Multi-Agent Challenge.
id	doaj-art-ea376692d38348549d28bc01ecdfa717
institution	OA Journals
issn	2227-7390
doi	10.3390/math13091429
citation	Mathematics, vol. 13, no. 9, article 1429 (2025-04-01)
affiliations	Zhitong Zhao, Ya Zhang: College of Management Science, Chengdu University of Technology, Chengdu 610059, China; Siying Wang: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Yang Zhou, Ruoning Zhang, Wenyu Chen: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
