Selective Reviews of Bandit Problems in AI via a Statistical View
Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes multi-armed bandit (MAB) and stochastic continuum-armed bandit (SCAB) problems, which model sequential...
| Main Authors: | Pengjie Zhou, Haoyu Wei, Huiming Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Mathematics |
| Online Access: | https://www.mdpi.com/2227-7390/13/4/665 |
Similar Items
- Adaptive Noise Exploration for Neural Contextual Multi-Armed Bandits
  by: Chi Wang, et al.
  Published: (2025-01-01)
- Nonstationary Stochastic Bandits: UCB Policies and Minimax Regret
  by: Lai Wei, et al.
  Published: (2024-01-01)
- Model-based exploration is measurable across tasks but not linked to personality and psychiatric assessments
  by: Kristin Witte, et al.
  Published: (2025-07-01)
- Gaussian Process with Vine Copula-Based Context Modeling for Contextual Multi-Armed Bandits
  by: Jong-Min Kim
  Published: (2025-06-01)
- Causal contextual bandits with one-shot data integration
  by: Chandrasekar Subramanian, et al.
  Published: (2024-12-01)