Policy Similarity Measure for Two-Player Zero-Sum Games
Policy space response oracles (PSRO) is an important algorithmic framework for approximating Nash equilibria in two-player zero-sum games. Enhancing policy diversity has been shown to improve the performance of PSRO in this approximation process significantly. However, existing diversity metrics are...
| Main Authors: | Hongsong Tang, Liuyu Xiang, Zhaofeng He |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Applied Sciences |
| Subjects: | game theory, reinforcement learning, multi-agent systems, policy diversity |
| Online Access: | https://www.mdpi.com/2076-3417/15/5/2815 |
| _version_ | 1850031416603049984 |
|---|---|
| author | Hongsong Tang Liuyu Xiang Zhaofeng He |
| collection | DOAJ |
| description | Policy space response oracles (PSRO) is an important algorithmic framework for approximating Nash equilibria in two-player zero-sum games. Enhancing policy diversity has been shown to improve the performance of PSRO in this approximation process significantly. However, existing diversity metrics are often prone to redundancy, which can hinder optimal strategy convergence. In this paper, we introduce the policy similarity measure (PSM), a novel approach that combines Gaussian and cosine similarity measures to assess policy similarity. We further incorporate the PSM into the PSRO framework as a regularization term, effectively fostering a more diverse policy population. We demonstrate the effectiveness of our method in two distinct game environments: a non-transitive mixture model and Leduc poker. The experimental results show that the PSM-augmented PSRO outperforms baseline methods in reducing exploitability by approximately 7% and exhibits greater policy diversity in visual analysis. Ablation studies further validate the benefits of combining Gaussian and cosine similarities in cultivating more diverse policy sets. This work provides a valuable method for measuring and improving the policy diversity in two-player zero-sum games. |
| format | Article |
| id | doaj-art-489f08c9d5eb4f95bd04bf8c4806acce |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-489f08c9d5eb4f95bd04bf8c4806acce; 2025-08-20T02:58:58Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-03-01; vol. 15, no. 5, art. 2815; DOI 10.3390/app15052815; Policy Similarity Measure for Two-Player Zero-Sum Games; Hongsong Tang (School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China); Liuyu Xiang (School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China); Zhaofeng He (School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China); https://www.mdpi.com/2076-3417/15/5/2815; game theory; reinforcement learning; multi-agent systems; policy diversity |
| title | Policy Similarity Measure for Two-Player Zero-Sum Games |
| topic | game theory reinforcement learning multi-agent systems policy diversity |
| url | https://www.mdpi.com/2076-3417/15/5/2815 |