Combining Multiple Strategies for Multiarmed Bandit Problems and Asymptotic Optimality

Bibliographic Details
Main Authors: Hyeong Soo Chang, Sanghee Choe
Format: Article
Language: English
Published: Wiley 2015-01-01
Series: Journal of Control Science and Engineering
Online Access: http://dx.doi.org/10.1155/2015/264953
Description
Summary: This brief paper provides a simple algorithm that, at each time step, selects a strategy from a given set of strategies for stochastic multiarmed bandit problems and then plays the arm chosen by the selected strategy. The algorithm follows the idea of the probabilistic ϵ_t-switching in the ϵ_t-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set under some conditions on the strategies in the set and on the sequence {ϵ_t}.
ISSN: 1687-5249, 1687-5257
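
The summary above describes an ϵ_t-switching rule applied over a set of bandit strategies rather than over individual arms. The following is a minimal Python sketch of that general idea, not the authors' exact algorithm: it assumes, hypothetically, that each strategy object exposes select_arm() and update(arm, reward) methods and that a decreasing schedule t -> ϵ_t is supplied. With probability ϵ_t a strategy is chosen uniformly at random; otherwise the strategy with the highest empirical mean reward so far is chosen.

```python
import random


class EpsilonTSwitch:
    """Sketch of epsilon_t-switching over a set of bandit strategies.

    At each step, with probability eps_t a strategy is chosen uniformly at
    random (exploration over strategies); otherwise the strategy with the
    highest empirical mean reward observed so far is chosen (exploitation).
    The chosen strategy then selects the arm to play.
    """

    def __init__(self, strategies, eps_schedule):
        self.strategies = strategies          # objects with select_arm()/update() (assumed interface)
        self.eps_schedule = eps_schedule      # callable t -> eps_t in [0, 1]
        self.counts = [0] * len(strategies)   # times each strategy was selected
        self.means = [0.0] * len(strategies)  # empirical mean reward per strategy
        self.t = 0

    def select_strategy(self):
        self.t += 1
        if random.random() < self.eps_schedule(self.t):
            return random.randrange(len(self.strategies))
        return max(range(len(self.strategies)), key=lambda i: self.means[i])

    def play(self, bandit_pull):
        """bandit_pull: callable arm -> observed reward."""
        i = self.select_strategy()
        arm = self.strategies[i].select_arm()
        reward = bandit_pull(arm)
        # Update the empirical mean reward attributed to the selected strategy.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]
        # Let the strategy that played update its own internal state.
        self.strategies[i].update(arm, reward)
        return arm, reward
```

As a usage example, a decreasing schedule such as eps_schedule = lambda t: min(1.0, 5.0 / t) could be passed in, mirroring common choices for ϵ_t-greedy; the precise conditions on the strategy set and on {ϵ_t} that yield asymptotic optimality are those stated in the article.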