Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

This work concerns finite-state Markov decision chains endowed with the long-run average reward criterion. Assuming that the optimality equation has a solution, it is shown that a nearly optimal stationary policy, as well as an approximation to the optimal average reward within a specified error, can be obtained in a finite number of steps of the value iteration method. These results extend others already available in the literature, which were established under more stringent restrictions on the ergodic structure of the decision process. Copyright Springer-Verlag Berlin Heidelberg 2002
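To make the procedure described in the abstract concrete, the following is a minimal sketch of average-reward value iteration with a span-based stopping rule. It is not taken from the article. The function name, the array layout (`P` of shape actions x states x states, `r` of shape states x actions), the mixing weight `tau`, and the two-state example are all assumptions chosen for illustration. The sketch mixes each transition matrix with the identity, one common form of the aperiodicity transformation attributed to Schweitzer, and stops when the span of successive differences falls below `eps`. At that point the minimum and maximum differences bracket the optimal average reward, and the greedy stationary policy is within `eps` of optimal.

```python
import numpy as np


def average_reward_value_iteration(P, r, eps=1e-6, tau=0.5, max_iter=100_000):
    """Illustrative sketch (hypothetical helper, not the article's algorithm).

    P: array of shape (A, S, S), P[a, s, j] = probability of moving s -> j under a.
    r: array of shape (S, A), one-step rewards.
    Returns (gain estimate, greedy policy, (lower bound, upper bound)).
    """
    n_states = P.shape[1]
    # Aperiodicity transformation: mix every kernel with the identity.
    # Rewards are left unchanged, so each stationary policy keeps its gain.
    P_mix = tau * P + (1.0 - tau) * np.eye(n_states)

    v = np.zeros(n_states)
    for _ in range(max_iter):
        # q[s, a] = r[s, a] + sum_j P_mix[a, s, j] * v[j]
        q = r + np.einsum("asj,j->sa", P_mix, v)
        v_new = q.max(axis=1)
        diff = v_new - v
        lo, hi = diff.min(), diff.max()
        if hi - lo < eps:
            # lo <= optimal gain <= hi, and the greedy policy has gain >= lo.
            return 0.5 * (lo + hi), q.argmax(axis=1), (lo, hi)
        # Subtract a reference value to keep the iterates bounded;
        # this does not change the successive differences.
        v = v_new - v_new[0]
    raise RuntimeError("span criterion not met within max_iter iterations")


# Assumed toy model: action 0 stays in place, action 1 switches state.
# Staying pays 1 in state 0 and 2 in state 1; switching pays 0.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
r = np.array([[1.0, 0.0],
              [2.0, 0.0]])

gain, policy, bounds = average_reward_value_iteration(P, r)
print(gain, policy, bounds)
```

In this toy model the policy that always stays has two recurrent classes, so the model is not unichain, yet the optimality equation has a solution with constant gain 2 (move to state 1 and stay there). That is the kind of weaker ergodic structure the abstract refers to.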


Year of publication:	2002
Authors:	Cavazos-Cadena, Rolando
Published in:	Computational Statistics. - Springer. - Vol. 56.2002, 2, p. 181-196
Publisher:	Springer
Subject:	AMS Subject Classifications. Primary | Secondary | Key words: Successive approximations | Markov decision processes | Schweitzer's Transformation | Optimality Equation | Convergence of the value iteration approximations


Extent:	text/html
Type of publication:	Article
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10010759438