An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem
Ryo Watanabe, Atsuyoshi Nakamura, Mineichi Kudo (Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan)
Year of publication: | November 2015 |
---|---|
Authors: | Watanabe, Ryo ; Nakamura, Atsuyoshi ; Kudo, Mineichi |
Published in: | Operations research letters. - Amsterdam [u.a.] : Elsevier, ISSN 0167-6377, ZDB-ID 720735-9. - Vol. 43.2015, 6, p. 558-563 |
Subject: | Multi-armed bandit problem | Matching | Regret analysis | Combinatorial bandit | Online learning | Entscheidung | Decision | Lernprozess | Learning process | Spieltheorie | Game theory | Entscheidung unter Unsicherheit | Decision under uncertainty |
Similar items by subject
- Walliser, Bernard, (2008)
- The K-armed bandit problem with multiple priors
  Li, Jian, (2019)
- Bandit with similarity information
  Radoc, Benjamin, (2020)
Similar items by person