Comparative Evaluation of Mean Cumulative Regret in Multi-Armed Bandit Algorithms: ETC, UCB, Asymptotically Optimal UCB, and TS
This research provides insights into short-term and long-term decision-making across different variants of the Multi-Armed Bandit (MAB) problem, a classic problem in decision-making under uncertainty. In this study, four algorithms - Explore-Then-Commit (ETC), the Upper Confidence Bound (UCB...
| Main Author: | Lei Yicong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | EDP Sciences, 2025-01-01 |
| Series: | ITM Web of Conferences |
| Online Access: | https://www.itm-conferences.org/articles/itmconf/pdf/2025/04/itmconf_iwadi2024_01026.pdf |
Similar Items
- Nonstationary Stochastic Bandits: UCB Policies and Minimax Regret
  by: Lai Wei, et al.
  Published: (2024-01-01)
- Numerical analysis of springback with experimental validation using UCB test
  by: Rogério Lopes, et al.
  Published: (2024-01-01)
- YOLOv8-UCB: Visual Detection of Pouch Battery Using Improved YOLOv8
  by: Hao Hao, et al.
  Published: (2024-01-01)
- MSC-EVs and UCB-EVs promote skin wound healing and spatial transcriptome analysis
  by: Ruonan Li, et al.
  Published: (2025-02-01)
- Client aware adaptive federated learning using UCB-based reinforcement for people re-identification
  by: Dinah Waref, et al.
  Published: (2025-05-01)