Assisted-Value Factorization with Latent Interaction in Cooperate Multi-Agent Reinforcement Learning

Bibliographic Details
Main Authors: Zhitong Zhao, Ya Zhang, Siying Wang, Yang Zhou, Ruoning Zhang, Wenyu Chen
Format: Article
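Language: English
Published: MDPI AG 2025-04-01
Series: Mathematics
Subjects: reinforcement learning; decentralized partially observable Markov decision process (Dec-POMDP); multi-agent reinforcement learning; multi-agent value decomposition
Online Access: https://www.mdpi.com/2227-7390/13/9/1429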
collection DOAJ
description With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents change continuously, and current value decomposition methods struggle to handle these changes, which impairs the effectiveness of cooperative policies. In this paper, we introduce the concept of latent interaction and, building on it, develop a method for generating weights. The proposed method derives weights from history information, thereby improving the accuracy of value estimation. We further propose a dynamic masking mechanism that recalibrates history information according to the activity level of agents, improving the precision of latent interaction assessments. Experimental results demonstrate faster training and superior performance of the proposed method in both a multi-agent particle environment and the StarCraft Multi-Agent Challenge.
format Article
id doaj-art-ea376692d38348549d28bc01ecdfa717
institution OA Journals
issn 2227-7390
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-ea376692d38348549d28bc01ecdfa717
2025-08-20T02:30:46Z
Mathematics (MDPI AG, ISSN 2227-7390), vol. 13, no. 9, article 1429, 2025-04-01, eng
DOI: 10.3390/math13091429
Zhitong Zhao: College of Management Science, Chengdu University of Technology, Chengdu 610059, China
Ya Zhang: College of Management Science, Chengdu University of Technology, Chengdu 610059, China
Siying Wang: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Yang Zhou: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Ruoning Zhang: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Wenyu Chen: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
topic reinforcement learning
decentralized partially observable Markov decision process (Dec-POMDP)
multi-agent reinforcement learning
multi-agent value decomposition