This paper revisits a recent study by Posen and Levinthal (2012) on the exploration/exploitation tradeoff in a multi-armed bandit problem whose reward probabilities undergo random shocks. We show that their analysis suffers from two shortcomings: it assumes that learning is based on stale...