Exploiting the structural properties of the underlying Markov decision problem in the Q-learning algorithm
Year of publication: | 2008 |
---|---|
Authors: | Kunnumkal, Sumit ; Topaloğlu, Hüseyin |
Published in: | INFORMS journal on computing : JOC. - Catonsville, MD : INFORMS, ISSN 1091-9856, ZDB-ID 1316077-1. - Vol. 20.2008, 2, p. 288-301 |
Subject: | Theory | Markov chain | Algorithm | Decision | Mathematical programming |
-
A unified algorithm framework for mean-variance optimization in discounted Markov decision processes
Ma, Shuai, (2023)
-
Development of a hybrid model to plan segment based optimal promotion strategy
Ekinci, Yeliz, (2023)
-
Improved and generalized upper bounds on the complexity of policy iteration
Scherrer, Bruno, (2016)
-
Kunnumkal, Sumit, (2010)
-
A randomized linear program for the network revenue management problem with customer choice behavior
Kunnumkal, Sumit, (2011)
-
A randomized linear programming method for network revenue management with product-specific no-shows
Kunnumkal, Sumit, (2011)