On boundedness of Q-learning iterates for stochastic shortest path problems
| Year of publication: | 2013 |
|---|---|
| Authors: | Yu, Huizhen ; Bertsekas, Dimitri P. |
| Published in: | Mathematics of operations research. - Catonsville, MD : INFORMS, ISSN 0364-765X, ZDB-ID 195683-8. - Vol. 38.2013, 2, p. 209-227 |
| Subject: | Markov decision processes; Q-learning; stochastic approximation; dynamic programming; reinforcement learning; Theory; Markov chain; Stochastic process; Learning process; Mathematical programming; Decision |
- Pakiman, Parshan, (2025)
- Optimising darts strategy using Markov decision processes and reinforcement learning
  Baird, Graham, (2020)
- Envelope theorems for multistage linear stochastic optimization
  Terça, Gonçalo, (2021)

- Q-learning and enhanced policy iteration in discounted dynamic programming
  Bertsekas, Dimitri P., (2012)
- On near optimality of the set of finite-state controllers for average cost POMDP
  Yu, Huizhen, (2008)
- Error bounds for approximations from projected linear equations
  Yu, Huizhen, (2010)