No-Regret Learning and Equilibrium Computation in Quantum Games

As quantum processors advance, the emergence of large-scale decentralized systems involving interacting quantum-enabled agents is on the horizon. Recent research efforts have explored quantum versions of Nash and correlated equilibria as solution concepts of strategic quantum interactions, but these...

Full description

Saved in:
Bibliographic Details
Main Authors: Wayne Lin, Georgios Piliouras, Ryann Sim, Antonios Varvitsiotis
Format: Article
Language:English
Published: Verein zur Förderung des Open Access Publizierens in den Quantenwissenschaften 2024-12-01
Series:Quantum
Online Access:https://quantum-journal.org/papers/q-2024-12-17-1569/pdf/
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:As quantum processors advance, the emergence of large-scale decentralized systems involving interacting quantum-enabled agents is on the horizon. Recent research efforts have explored quantum versions of Nash and correlated equilibria as solution concepts of strategic quantum interactions, but these approaches did not directly connect to decentralized adaptive setups where agents possess limited information. This paper delves into the dynamics of quantum-enabled agents within decentralized systems that employ no-regret algorithms to update their behaviors over time. Specifically, we investigate two-player quantum zero-sum games and polymatrix quantum zero-sum games, showing that no-regret algorithms converge to separable quantum Nash equilibria in time-average. In the case of general multi-player quantum games, our work leads to a novel solution concept, that of the separable quantum coarse correlated equilibria (QCCE), as the convergent outcome of the time-averaged behavior no-regret algorithms, offering a natural solution concept for decentralized quantum systems. Finally, we show that computing QCCEs can be formulated as a semidefinite program and establish the existence of entangled (i.e., non-separable) QCCEs, which are unlearnable via the current paradigm of no-regret learning.
ISSN:2521-327X