Search Results - subject:"Unbounded reward"

Multi-armed bandit with sub-exponential rewards

Jia, Huiwen; Shi, Cong; Shen, Siqian - In: Operations research letters 49 (2021) 5, pp. 728-733

Persistent link: https://www.econbiz.de/10013207439

Constrained continuous-time Markov decision processes with average criteria

Zhang, Lanlan; Guo, Xianping - In: Mathematical Methods of Operations Research 67 (2008) 2, pp. 323-340

unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is …

Persistent link: https://www.econbiz.de/10010759169

Constrained continuous-time Markov decision processes with average criteria

Zhang, Lanlan; Guo, Xianping - In: Mathematical Methods of Operations Research 67 (2008) 2, pp. 323-340

unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is …

Persistent link: https://www.econbiz.de/10010949960

Sample-path optimality and variance-maximization for Markov decision processes

Zhu, Q. - In: Mathematical Methods of Operations Research 65 (2007) 3, pp. 519-538

This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system’s primitive data and...

Persistent link: https://www.econbiz.de/10010759507

Sample-path optimality and variance-maximization for Markov decision processes

Zhu, Q. - In: Mathematical Methods of Operations Research 65 (2007) 3, pp. 519-538

Persistent link: https://www.econbiz.de/10010950302