Vector-valued Markov decision processes and the systems of linear inequalities
For a vector-valued Markov decision process, we characterize optimal (deterministic) stationary policies by systems of linear inequalities and present an algorithm for finding all optimal stationary policies from among all randomized, history-remembering ones. The algorithm consists of improving the policies and of checking the optimality of a policy by solving the associated system of linear inequalities via Fourier elimination.
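The optimality check mentioned in the abstract rests on deciding whether a system of linear inequalities has a solution. As a rough illustration only, the sketch below applies Fourier elimination (commonly called Fourier-Motzkin elimination) to a generic system A x <= b with non-strict inequalities. The function names `eliminate_last` and `is_feasible` are hypothetical, the specific systems derived in the paper are not reproduced here, and the sketch assumes at least one inequality and exact rational input.

```python
from fractions import Fraction


def eliminate_last(rows):
    """Eliminate the last variable from rows [a_1, ..., a_n, b], each meaning a.x <= b."""
    pos, neg, out = [], [], []
    for r in rows:
        c = r[-2]  # coefficient of the variable being eliminated
        if c > 0:
            pos.append(r)  # gives an upper bound on the variable
        elif c < 0:
            neg.append(r)  # gives a lower bound on the variable
        else:
            out.append(r[:-2] + [r[-1]])  # variable absent: keep the row, drop its column
    for p in pos:
        for q in neg:
            cp, cq = p[-2], -q[-2]  # both positive scaling factors
            # Scale both rows so the eliminated coefficients are +1 and -1, then add.
            combined = [pi / cp + qi / cq for pi, qi in zip(p, q)]
            out.append(combined[:-2] + [combined[-1]])
    return out


def is_feasible(A, b):
    """Return True if some real x satisfies A x <= b (non-strict inequalities only)."""
    rows = [[Fraction(v) for v in a] + [Fraction(bi)] for a, bi in zip(A, b)]
    for _ in range(len(A[0])):
        rows = eliminate_last(rows)
    # Every remaining row has the form 0 <= b.
    return all(r[0] >= 0 for r in rows)


# x + y <= 1, x >= 0, y >= 0 has solutions; x <= 0 together with x >= 1 does not.
feasible = is_feasible([[1, 1], [-1, 0], [0, -1]], [1, 0, 0])
infeasible = is_feasible([[1], [-1]], [0, -1])
```

Each elimination step can multiply the number of inequalities, so this approach is practical only for small systems; exact rational arithmetic is used so that the final sign checks are not affected by rounding.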
Year of publication: | 1995 |
---|---|
Authors: | Wakuta, Kazuyoshi |
Published in: | Stochastic Processes and their Applications. - Elsevier, ISSN 0304-4149. - Vol. 56.1995, 1, p. 159-169 |
Publisher: | Elsevier |
Keywords: | Dynamic programming; Markov decision process; Multiobjective; Linear inequalities; Fourier elimination |
Similar items by person
- A first-passage problem with multiple costs. Wakuta, Kazuyoshi (2000)
- A note on the structure of value spaces in vector-valued Markov decision processes. Wakuta, Kazuyoshi (1999)
- Semi-Markov decision processes with incomplete state observation, average cost criterion. Wakuta, Kazuyoshi (1981)