In a multi-armed bandit (MAB) problem, a gambler needs to choose at each round of play one of K arms, each characterized by an unknown reward distribution. Reward realizations are only observed when an arm is selected, and the gambler's objective is to maximize cumulative expected earnings over...
Persistent link: https://www.econbiz.de/10012856685
We consider a non-stationary variant of a sequential stochastic optimization problem, where the underlying cost functions may change along the horizon. We propose a measure, termed variation budget, that controls the extent of said change, and study how restrictions on this budget impact...
Persistent link: https://www.econbiz.de/10013035332
Persistent link: https://www.econbiz.de/10011397831
We consider a single-product revenue management problem where, given an initial inventory, the objective is to dynamically adjust prices over a finite sales horizon to maximize expected revenues. Realized demand is observed over time, but the underlying functional relationship between price and...
Persistent link: https://www.econbiz.de/10013119422
We consider a platform facilitating trade between sellers and buyers with the objective of maximizing consumer surplus. Even though in many such marketplaces prices are set by revenue-maximizing sellers, platforms can influence prices through (i) price-dependent promotion policies that can...
Persistent link: https://www.econbiz.de/10012847343
Persistent link: https://www.econbiz.de/10012607132
Sequential experiments are deployed in a variety of practices, including for optimizing product recommendations and pricing in online platforms. Such experiments are often characterized by an exploration-exploitation tradeoff that is well-understood when at each time period feedback is received...
Persistent link: https://www.econbiz.de/10013218225
Persistent link: https://www.econbiz.de/10014393037
In repeated games, strategies are often evaluated by their ability to guarantee the performance of the single best action that is selected in hindsight (a property referred to as Hannan consistency, or no-regret). However, the effectiveness of the single best action as a yardstick to evaluate...
Persistent link: https://www.econbiz.de/10014264316
Many operational settings share the following three features: (i) a centralized planning system allocates tasks to workers or service providers, (ii) the providers generate value by completing the tasks, and (iii) the completion of tasks influences the providers' welfare. In such cases, the...
Persistent link: https://www.econbiz.de/10012065219