Assisted-Value Factorization with Latent Interaction in Cooperate Multi-Agent Reinforcement Learning
With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents are continuously changing. The current value decompositio...
| Main Authors: | Zhitong Zhao, Ya Zhang, Siying Wang, Yang Zhou, Ruoning Zhang, Wenyu Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Mathematics |
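| Subjects: | reinforcement learning; decentralized partially observable Markov decision process (Dec-POMDP); multi-agent reinforcement learning; multi-agent value decomposition |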
| Online Access: | https://www.mdpi.com/2227-7390/13/9/1429 |
| _version_ | 1850137658666254336 |
|---|---|
| author | Zhitong Zhao, Ya Zhang, Siying Wang, Yang Zhou, Ruoning Zhang, Wenyu Chen |
| author_facet | Zhitong Zhao, Ya Zhang, Siying Wang, Yang Zhou, Ruoning Zhang, Wenyu Chen |
| author_sort | Zhitong Zhao |
| collection | DOAJ |
| description | With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents are continuously changing. The current value decomposition methods struggle to adeptly handle these dynamic changes, thereby impairing the effectiveness of cooperative policies. In this paper, we introduce the concept of latent interaction, upon which an innovative method for generating weights is developed. The proposed method derives weights from the history information, thereby enhancing the accuracy of value estimations. Building upon this, we further propose a dynamic masking mechanism that recalibrates history information in response to the activity level of agents, improving the precision of latent interaction assessments. Experimental results demonstrate the improved training speed and superior performance of the proposed method in both a multi-agent particle environment and the StarCraft Multi-Agent Challenge. |
| format | Article |
| id | doaj-art-ea376692d38348549d28bc01ecdfa717 |
| institution | OA Journals |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-ea376692d38348549d28bc01ecdfa717; 2025-08-20T02:30:46Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-04-01; 13; 9; 1429; 10.3390/math13091429; Assisted-Value Factorization with Latent Interaction in Cooperate Multi-Agent Reinforcement Learning; Zhitong Zhao (0); Ya Zhang (1); Siying Wang (2); Yang Zhou (3); Ruoning Zhang (4); Wenyu Chen (5); College of Management Science, Chengdu University of Technology, Chengdu 610059, China; College of Management Science, Chengdu University of Technology, Chengdu 610059, China; School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; With the development of value decomposition methods, multi-agent reinforcement learning (MARL) has made significant progress in balancing autonomous decision making with collective cooperation. However, the collaborative dynamics among agents are continuously changing. The current value decomposition methods struggle to adeptly handle these dynamic changes, thereby impairing the effectiveness of cooperative policies. In this paper, we introduce the concept of latent interaction, upon which an innovative method for generating weights is developed. The proposed method derives weights from the history information, thereby enhancing the accuracy of value estimations. Building upon this, we further propose a dynamic masking mechanism that recalibrates history information in response to the activity level of agents, improving the precision of latent interaction assessments. Experimental results demonstrate the improved training speed and superior performance of the proposed method in both a multi-agent particle environment and the StarCraft Multi-Agent Challenge.; https://www.mdpi.com/2227-7390/13/9/1429; reinforcement learning; decentralized partially observable Markov decision process (Dec-POMDP); multi-agent reinforcement learning; multi-agent value decomposition |
| title | Assisted-Value Factorization with Latent Interaction in Cooperate Multi-Agent Reinforcement Learning |
| title_sort | assisted value factorization with latent interaction in cooperate multi agent reinforcement learning |
| topic | reinforcement learning; decentralized partially observable Markov decision process (Dec-POMDP); multi-agent reinforcement learning; multi-agent value decomposition |
| url | https://www.mdpi.com/2227-7390/13/9/1429 |