A policy gradient algorithm for the risk-sensitive exponential cost MDP
Year of publication: | 2025 |
---|---|
Authors: | Moharrami, Mehrdad ; Murthy, Yashaswini ; Roy, Arghyadip ; Srikant, Rayadurgam |
Published in: | Mathematics of operations research. - Hanover, Md. : INFORMS, ISSN 1526-5471, ZDB-ID 2004273-5. - Vol. 50.2025, 1, p. 431-458 |
Subject: | policy gradient theorem ; reinforcement learning ; risk-sensitive Markov decision processes ; stochastic approximation ; Theorie ; Theory ; Markov-Kette ; Markov chain ; Mathematische Optimierung ; Mathematical programming ; Algorithmus ; Algorithm ; Stochastischer Prozess ; Stochastic process |
- Wang, Mengdi, (2020)
- A hybrid deep learning method for optimal insurance strategies : algorithms and convergence analysis
  Jin, Zhuo, (2021)
- On boundedness of Q-learning iterates for stochastic shortest path problems
  Yu, Huizhen, (2013)
- The power of slightly more than one sample in randomized load balancing
  Ying, Lei, (2017)
- Heavy-traffic insensitive bounds for weighted proportionally fair bandwidth sharing policies
  Wang, Weina, (2022)
- Asymptotic Behavior of Internet Congestion Controllers in a Many-Flows Regime
  Deb, Supratim, (2005)