A policy gradient algorithm for the risk-sensitive exponential cost MDP
| Year of publication: | 2025 |
|---|---|
| Authors: | Moharrami, Mehrdad ; Murthy, Yashaswini ; Roy, Arghyadip ; Srikant, Rayadurgam |
| Published in: | Mathematics of operations research. - Hanover, Md. : INFORMS, ISSN 1526-5471, ZDB-ID 2004273-5. - Vol. 50.2025, 1, p. 431-458 |
| Subject: | policy gradient theorem ; reinforcement learning ; risk-sensitive Markov decision processes ; stochastic approximation ; Theory ; Markov chain ; Mathematical programming ; Algorithm ; Stochastic process |
- A hybrid deep learning method for optimal insurance strategies : algorithms and convergence analysis
  Jin, Zhuo, (2021)
- Wang, Mengdi, (2020)
- On boundedness of Q-learning iterates for stochastic shortest path problems
  Yu, Huizhen, (2013)
- Winnicki, Anna, (2025)
- The power of slightly more than one sample in randomized load balancing
  Ying, Lei, (2017)
- Heavy-traffic insensitive bounds for weighted proportionally fair bandwidth sharing policies
  Wang, Weina, (2022)