Haviv, Moshe - In: Stochastic Processes and their Applications 19 (1985) 1, pp. 151-160
In this paper we suggest a new successive approximation method to compute the optimal discounted reward for finite-state, finite-action, discrete-time Markov decision chains. The method is based on a block partitioning of the (stochastic) matrices corresponding to the stationary...
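
For orientation, the classical successive approximation scheme (value iteration) that such methods build on can be sketched as follows. This is not the block-partitioned method of the paper, whose details are not given in the abstract above; it is only the standard baseline iteration v ← max_a [r_a + β P_a v]. The names `value_iteration`, `P`, `r`, `beta`, and `tol`, as well as the two-state example data, are assumptions made for illustration.

```python
def value_iteration(P, r, beta, tol=1e-10, max_iter=10_000):
    """Classical successive approximation for a finite discounted MDP.

    P[a][s][t]: probability of moving from state s to state t under action a.
    r[a][s]:    one-step reward for choosing action a in state s.
    beta:       discount factor, assumed to lie in [0, 1).
    """
    n_actions = len(P)
    n_states = len(P[0])
    v = [0.0] * n_states
    for _ in range(max_iter):
        # Apply the optimality operator once to the current estimate.
        new_v = [
            max(
                r[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    return v


# Hypothetical example: action 0 stays in place, action 1 swaps the two states.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: swap
]
r = [
    [1.0, 2.0],  # rewards for action 0 in states 0 and 1
    [0.5, 0.0],  # rewards for action 1 in states 0 and 1
]
v = value_iteration(P, r, beta=0.5)
```

In this small example the optimal policy stays in state 1 (value 2 / (1 - 0.5) = 4) and swaps out of state 0 (value 0.5 + 0.5 · 4 = 2.5), so the iteration should approach those two values.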