exploiting structural information that is commonly available in practice. We propose a novel learning algorithm that we call … viable learning policy for structured bandit problems that has asymptotic minimal regret …
Persistent link: https://www.econbiz.de/10012828319
Persistent link: https://www.econbiz.de/10012649515
Persistent link: https://www.econbiz.de/10012314127
Persistent link: https://www.econbiz.de/10012301569
Persistent link: https://www.econbiz.de/10012264495
Learning. Specifically, our study is framed in the context where a reference discrete optimization problem is given and there …
Persistent link: https://www.econbiz.de/10012390942
Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning introduces the evolving … optimization via temporal differences and Reinforcement Learning: Q-Learning, SARSA, and R-SMART algorithms, and policy search, via … API, Q-P-Learning, actor-critics, and learning automata · A special examination of neural-network-based function …
Persistent link: https://www.econbiz.de/10012402233
dynamic programming for discrete states -- Approximate dynamic programming and reinforcement learning for discrete states … -- Numerical dynamic programming for continuous states -- Approximate dynamic programming and reinforcement learning for continuous …
Persistent link: https://www.econbiz.de/10012422812
Persistent link: https://www.econbiz.de/10012430186
Persistent link: https://www.econbiz.de/10012588027