EconBiz - Find Economic Literature
  • Search: subject:"Unbounded reward"
Showing 1 - 5 of 5
Multi-armed bandit with sub-exponential rewards
Jia, Huiwen; Shi, Cong; Shen, Siqian - In: Operations research letters 49 (2021) 5, pp. 728-733
Persistent link: https://www.econbiz.de/10013207439
Constrained continuous-time Markov decision processes with average criteria
Zhang, Lanlan; Guo, Xianping - In: Computational Statistics 67 (2008) 2, pp. 323-340
unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is …
Persistent link: https://www.econbiz.de/10010759169
Constrained continuous-time Markov decision processes with average criteria
Zhang, Lanlan; Guo, Xianping - In: Mathematical Methods of Operations Research 67 (2008) 2, pp. 323-340
unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is …
Persistent link: https://www.econbiz.de/10010949960
Sample-path optimality and variance-maximization for Markov decision processes
Zhu, Q. - In: Computational Statistics 65 (2007) 3, pp. 519-538
This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system’s primitive data and...
Persistent link: https://www.econbiz.de/10010759507
Sample-path optimality and variance-maximization for Markov decision processes
Zhu, Q. - In: Mathematical Methods of Operations Research 65 (2007) 3, pp. 519-538
This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system’s primitive data and...
Persistent link: https://www.econbiz.de/10010950302
A service of the ZBW