GOULIONIS, JOHN; STENGOS, D. - In: International Journal of Information Technology & … 10 (2011) 06, pp. 1175-1197
This paper treats the infinite-horizon discounted-cost control problem for partially observable Markov decision processes. Sondik studied the class of finitely transient policies and showed that their value functions over an infinite time horizon are piecewise linear (p.w.l.) and can be computed...