Arruda, E.F.; Fragoso, M.D.; do Val, J.B.R. - In: European Journal of Operational Research 211 (2011) 2, pp. 343-351
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This...
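The convexity claim for a fixed policy can be checked numerically with a minimal sketch. It assumes the standard definitions, which the abstract itself does not spell out: the fixed-policy Bellman operator (T V)(s) = r(s) + gamma * sum_j P[s][j] V(j), the Bellman residual T V - V, and the span semi-norm sp(x) = max(x) - min(x). The 3-state transition matrix `P`, the reward vector `r`, the discount `gamma`, and all function names are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def bellman_residual(P, r, gamma, V):
    """Residual T V - V for a fixed policy (assumed standard definition)."""
    n = len(V)
    return [
        r[s] + gamma * sum(P[s][j] * V[j] for j in range(n)) - V[s]
        for s in range(n)
    ]


def span(x):
    """Span semi-norm: largest component minus smallest component."""
    return max(x) - min(x)


def residual_span(P, r, gamma, V):
    """Span semi-norm of the Bellman residual at the candidate solution V."""
    return span(bellman_residual(P, r, gamma, V))


# Hypothetical 3-state Markov chain induced by some fixed policy.
P = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]
r = [1.0, 0.0, 2.0]
gamma = 0.9

# Sample random pairs of candidates and a mixing weight, and record the
# largest violation of the convexity inequality
#   f(lam*V1 + (1-lam)*V2) <= lam*f(V1) + (1-lam)*f(V2),
# where f is the span of the Bellman residual.
rng = random.Random(0)
worst_gap = float("-inf")
for _ in range(1000):
    V1 = [rng.uniform(-10.0, 10.0) for _ in range(3)]
    V2 = [rng.uniform(-10.0, 10.0) for _ in range(3)]
    lam = rng.random()
    mix = [lam * a + (1.0 - lam) * b for a, b in zip(V1, V2)]
    lhs = residual_span(P, r, gamma, mix)
    rhs = lam * residual_span(P, r, gamma, V1) + (1.0 - lam) * residual_span(P, r, gamma, V2)
    worst_gap = max(worst_gap, lhs - rhs)

# A non-positive worst_gap (up to rounding) is consistent with convexity.
print(worst_gap <= 1e-9)
```

Under these assumed definitions the inequality is expected to hold because the residual is an affine function of V and the span is a semi-norm. The random sampling only illustrates the property on one small example; it does not reproduce the paper's proof or its approximate value iteration algorithm.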