This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This...
Persistent link: https://www.econbiz.de/10008865135
This paper introduces a two-phase approach to solve average cost Markov decision processes, which is based on state space embedding or time aggregation. In the first phase, time aggregation is applied for policy optimization in a prescribed subset of the state space, and a novel result is...
Persistent link: https://www.econbiz.de/10010939789
This paper is a companion to Leão et al. (Probab. Statist. Lett. 42 (1999) 409; Proyecciones--J. Math. 23 (2004) 15), in that we carry out further studies on the properties of Radon spaces, proposed in Leão (Radon spaces, Ph.D. Thesis, 1999). For instance, we study topological...
Persistent link: https://www.econbiz.de/10005313921
Persistent link: https://www.econbiz.de/10005277514
Persistent link: https://www.econbiz.de/10007900003
Persistent link: https://www.econbiz.de/10006659465
Persistent link: https://www.econbiz.de/10006784706