Similar Search Results

Non-Stationary Reinforcement Learning : The Blessing of (More) Optimism

Cheung, Wang Chi; Simchi-Levi, David; Zhu, Ruihao - 2021

Motivated by applications in inventory control and real-time bidding, we consider undiscounted reinforcement learning (RL) in Markov decision processes (MDPs) under temporal drifts. In this setting, both the reward and state transition distributions are allowed to evolve over time, as long as...

Persistent link: https://www.econbiz.de/10014105917

Inventory Balancing with Online Learning

Cheung, Wang Chi - 2018

We study a general problem of allocating limited resources to heterogeneous customers over time, under model uncertainty. Each type of customer can be serviced using different actions, each of which stochastically consumes some combination of resources, and returns different rewards for the...

Persistent link: https://www.econbiz.de/10012912567

Nonstationary reinforcement learning : the blessing of (more) optimism

Cheung, Wang Chi; Simchi-Levi, David; Zhu, Ruihao - In: Management science : journal of the Institute for … 69 (2023) 10, pp. 5722-5739

Persistent link: https://www.econbiz.de/10014392977

Bandits atop reinforcement learning : tackling online inventory models with cyclic demands

Gong, Xiao-Yue; Simchi-Levi, David - In: Management science : journal of the Institute for … 70 (2024) 9, pp. 6139-6157

Persistent link: https://www.econbiz.de/10015138035

Dynamic pricing and demand learning with limited price experimentation

Cheung, Wang Chi; Simchi-Levi, David; Wang, He - In: Operations research 65 (2017) 6, pp. 1722-1731

Persistent link: https://www.econbiz.de/10011777913

Bandits Atop Reinforcement Learning : Tackling Online Inventory Models With Cyclic Demands

Gong, Xiao-Yue; Simchi-Levi, David - 2021

Motivated by a long-standing gap between inventory theory and practice, we study online inventory models with unknown cyclic demand distributions. We design efficient bandits-atop-reinforcement-learning algorithms that cater to the structure of inventory problems. We apply the standard...

Persistent link: https://www.econbiz.de/10013210711

Dynamic Pricing and Demand Learning with Limited Price Experimentation

Cheung, Wang Chi - 2017

In a dynamic pricing problem where the demand function is not known a priori, price experimentation can be used as a demand learning tool. Existing literature usually assumes no constraint on price changes, but in practice sellers often face business constraints that prevent them from conducting...

Persistent link: https://www.econbiz.de/10012973088

Meta Dynamic Pricing : Transfer Learning Across Experiments

Bastani, Hamsa - 2020

We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation where the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then...

Persistent link: https://www.econbiz.de/10012850146

Offline Pricing and Demand Learning with Censored Data

Bu, Jinzhi - 2020

We study a single product pricing problem with demand censoring in an offline data-driven setting. In this problem, a retailer is given a finite amount of inventory, and faces a random demand that is price sensitive in a linear fashion with unknown price sensitivity and base demand distribution....

Persistent link: https://www.econbiz.de/10012832090

Multimodal dynamic pricing

Wang, Yining; Chen, Boxiao; Simchi-Levi, David - In: Management science : journal of the Institute for … 67 (2021) 10, pp. 6136-6152

Persistent link: https://www.econbiz.de/10012665750