A Numerical Dynamic Programming Algorithm for Optimal Learning Problems
This paper presents a numerical nonlinear dynamic programming algorithm for solving so-called optimal learning or adaptive control problems. These are decision problems with unknown parameters in which the decision maker updates beliefs by Bayes' rule. Because the updating equations are nonlinear, the dynamic decision problem exhibits multiple optima, nondifferentiability of the value function, and discontinuity of the policy function. Computational complexity rises quickly because multiple state variables are needed to describe the evolution of the decision maker's beliefs. The algorithm presented delivers approximations to optimal policies for a class of optimal learning problems.
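To illustrate why the belief-updating equations are nonlinear, the following minimal sketch assumes a simple setting that the abstract does not itself specify: a scalar linear model y = beta * x + eps with normally distributed noise of known variance and a normal prior on the unknown slope beta. The function name `update_beliefs` and all parameter names are hypothetical and chosen only for illustration. Under these assumptions the posterior mean and variance depend on the decision x through the term x^2, so the belief state evolves nonlinearly in the control.

```python
def update_beliefs(mean, var, x, y, noise_var=1.0):
    """One Bayes-rule update of a normal belief about an unknown slope.

    Illustrative assumption (not taken from the paper):
        y = beta * x + eps,  eps ~ N(0, noise_var),  prior beta ~ N(mean, var).
    Returns the posterior mean and variance of beta after observing (x, y).
    """
    denom = noise_var + var * x * x          # predictive variance of y given x
    new_mean = mean + var * x * (y - mean * x) / denom
    new_var = var * noise_var / denom        # shrinks faster for larger |x|
    return new_mean, new_var


if __name__ == "__main__":
    # A larger decision x yields a more informative observation.
    for x in (0.0, 0.5, 2.0):
        m, v = update_beliefs(mean=0.0, var=1.0, x=x, y=1.0 * x)
        print(f"x={x}: posterior mean={m:.4f}, posterior variance={v:.4f}")
```

In this sketch a decision of x = 0 leaves beliefs unchanged, while larger |x| reduces the posterior variance more. This is the trade-off between current control and learning that makes such problems hard to solve.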