Bertsekas, Dimitri P. - 2010
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated convergence behavior is complex, and not well understood at present. An important question is whether the policy iteration...