Persistent link: https://www.econbiz.de/10012630720
Persistent link: https://www.econbiz.de/10014307657
Persistent link: https://www.econbiz.de/10015145600
We consider online no-regret learning in unknown games with bandit feedback, where each agent observes only its reward at each time step (determined by all players' current joint action) rather than its gradient. We focus on the class of smooth and strongly monotone games and study optimal...
Persistent link: https://www.econbiz.de/10013312210
This paper proposes a general framework/meta-policy to solve Revenue Management (RM) problems with demand learning and a potentially large action space, constrained by initial unreplenishable resources. This framework combines the primal-dual method from optimization and...
Persistent link: https://www.econbiz.de/10014088662