Criterion Decomposition for the Myopic Sequential Control of Uncertain Systems
A method is proposed for sequentially updating criterion functions on the basis of past reward observations, in analogy to Bayes' rule for updating probability distribution functions.