A cooperative multi-agent reinforcement learning algorithm based on dynamic self-selection parameters sharing

Bibliographic Details
Main Authors: Han WANG, Yang YU, Yuan JIANG
Format: Article
Language: Chinese (zho)
Published: POSTS&TELECOM PRESS Co., LTD 2022-03-01
Series: 智能科学与技术学报 (Chinese Journal of Intelligent Science and Technology)
Online Access: http://www.cjist.com.cn/thesisDetails#10.11959/j.issn.2096-6652.202214
Description
Summary: In multi-agent reinforcement learning, parameter sharing can effectively alleviate the learning inefficiency caused by non-stationarity. However, maintaining the same policy for all agents during learning may have detrimental effects. To address this problem, a new approach is introduced that gives agents the ability to automatically identify peers that may benefit from parameter sharing and to dynamically share parameters with them during learning. Specifically, agents encode their empirical trajectories as implicit information that can represent their potential intentions, and select peers for parameter sharing by comparing these intentions. Experiments show that the proposed method not only improves the efficiency of parameter sharing but also preserves the quality of policy learning in multi-agent systems.
ISSN: 2096-6652
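
Illustrative sketch: The summary describes three steps: encoding each agent's trajectories as an intention representation, comparing intentions to select peers, and sharing parameters among the selected peers. The Python sketch below shows one way these steps could fit together. It is not the authors' implementation. Every name in it (encode_trajectory, select_sharing_groups, share_parameters, the threshold value) is hypothetical, the mean-of-features encoder stands in for whatever learned encoder the article uses, cosine similarity is an assumed comparison measure, and averaging parameters within a group is an assumed sharing rule.

```python
import math
from typing import Dict, List

Vector = List[float]


def encode_trajectory(trajectory: List[Vector]) -> Vector:
    """Assumed stand-in encoder: the mean of the per-step feature vectors."""
    dim = len(trajectory[0])
    return [sum(step[i] for step in trajectory) / len(trajectory) for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two intention vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_sharing_groups(intentions: Dict[str, Vector], threshold: float) -> List[List[str]]:
    """Group agents whose intentions are similar, using connected components."""
    agents = sorted(intentions)
    parent = {a: a for a in agents}

    def find(a: str) -> str:
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            if cosine_similarity(intentions[a], intentions[b]) >= threshold:
                parent[find(b)] = find(a)

    groups: Dict[str, List[str]] = {}
    for a in agents:
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())


def share_parameters(params: Dict[str, Vector], groups: List[List[str]]) -> Dict[str, Vector]:
    """Assumed sharing rule: every agent in a group receives the group's mean parameters."""
    shared: Dict[str, Vector] = {}
    for group in groups:
        dim = len(params[group[0]])
        mean = [sum(params[a][i] for a in group) / len(group) for i in range(dim)]
        for a in group:
            shared[a] = list(mean)
    return shared


# Toy data: agent_0 and agent_1 behave alike, agent_2 behaves differently.
trajectories = {
    "agent_0": [[1.0, 0.0], [0.9, 0.1]],
    "agent_1": [[0.8, 0.2], [1.0, 0.0]],
    "agent_2": [[0.0, 1.0], [0.1, 0.9]],
}
params = {"agent_0": [1.0, 2.0], "agent_1": [3.0, 4.0], "agent_2": [10.0, 20.0]}

intentions = {name: encode_trajectory(t) for name, t in trajectories.items()}
groups = select_sharing_groups(intentions, threshold=0.9)
new_params = share_parameters(params, groups)
```

In this toy run, agent_0 and agent_1 end up in one sharing group while agent_2 keeps its own parameters. Recomputing the groups at intervals during training would make the sharing dynamic in the sense the summary describes, though the actual schedule and selection criterion used in the article are not stated in this record.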