Online Learning for Constrained Assortment Optimization under Markov Chain Choice Model
We study a dynamic assortment selection problem in which arriving customers choose among the products offered from a universe of $N$ products under a Markov-chain-based choice (MCBC) model. In each period the retailer observes only the offered assortment and the customer's single choice. Given limited display capacity, resource constraints, and no a priori knowledge of the problem parameters, the retailer's objective is to sequentially learn the choice model and maximize cumulative revenue over a selling horizon of length $T$. We develop an explore-then-exploit learning algorithm that balances the trade-off between exploration and exploitation. The algorithm simultaneously estimates the arrival and transition probabilities of the MCBC model by solving linear equations and determines a near-optimal assortment based on these estimates. Furthermore, compared to existing heuristic estimation methods that suffer from inconsistency and a large computational burden, our consistent estimators enjoy superior computational times.
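To make the choice model in the abstract concrete, here is a minimal sketch of how choice probabilities can be computed in the standard Markov chain choice model once arrival and transition probabilities are known. It is not the paper's estimator or learning algorithm. The function name `choice_probabilities`, the state layout (state 0 is the no-purchase option, states 1..N are products), and the numbers in the example are all assumptions made for illustration.

```python
import numpy as np


def choice_probabilities(arrival, transition, assortment):
    """Choice probabilities for an offered assortment (illustrative sketch).

    arrival[i]       probability that a customer's first preference is state i
    transition[i, j] probability of moving from an unavailable state i to j
    assortment       iterable of offered product indices (subset of 1..N)

    Offered products and the no-purchase state 0 are absorbing; every other
    state is transient, and the customer keeps moving until absorbed.
    """
    arrival = np.asarray(arrival, dtype=float)
    transition = np.asarray(transition, dtype=float)
    n_states = len(arrival)

    absorbing = sorted(set(assortment) | {0})
    transient = [i for i in range(n_states) if i not in absorbing]

    probs = np.zeros(n_states)
    probs[absorbing] = arrival[absorbing]  # customers who arrive at an offered state

    if transient:
        rho_tt = transition[np.ix_(transient, transient)]
        rho_ta = transition[np.ix_(transient, absorbing)]
        # Expected visits v to transient states solve v^T (I - rho_tt) = arrival_T^T.
        identity = np.eye(len(transient))
        visits = np.linalg.solve((identity - rho_tt).T, arrival[transient])
        probs[absorbing] += visits @ rho_ta  # absorbed after one or more transitions

    return probs


# Hypothetical two-product instance; row 0 is unused because state 0 is absorbing.
arrival = [0.2, 0.5, 0.3]
transition = [
    [1.0, 0.0, 0.0],
    [0.4, 0.0, 0.6],
    [0.5, 0.5, 0.0],
]
probs = choice_probabilities(arrival, transition, assortment={1})
print(probs)
```

In this instance only product 1 is offered, so customers whose first preference is product 2 transition once and split evenly between product 1 and leaving without a purchase.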
| Year of publication: | 2022 |
|---|---|
| Authors: | Li, Shukai ; Luo, Qi ; Huang, Zhiyuan ; Shi, Cong |
| Publisher: | [S.l.] : SSRN |
| Subject: | Markov chain ; Theory ; Mathematical programming ; Consumer behaviour ; E-learning |
freely available
Similar items by subject
- Control of online-appointment systems when the booking status signals quality of service. Kaluza, Isabel (2024)
- Single-leg choice-based revenue management : a robust optimisation approach. Sierag, Dirk (2016)
- A MILP approach to the optimization of banner display strategy to tackle banner blindness. Zouharová, Martina (2016)
Similar items by person