Thompson Sampling: Endogenously Random Behavior in Games and Markets
Economists typically assume that agents maximize expected utility. However, many experiments have called expected utility maximization into question by showing that human behavior can be characterized as random. This paper proposes Thompson Sampling as a theory of human behavior across a wide range of dynamic strategic interactions in economics. Under Thompson Sampling, agents with limited information about their environment update their subjective belief distributions in a Bayesian way and then make a random draw from the posterior. Conditional on that draw, agents optimize. Although Bayesian reasoning has often been shown to be at odds with agents' behavior even in simple environments, this paper uses data from experimental games to show that Bayesian sampling as in Thompson's proposal describes agents' decision-making better than theories commonly used in economics, such as Nash equilibrium, standard Bayesian learning, and quantal response equilibrium (QRE), above all in complex environments with many possible actions.
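
The three steps described above (Bayesian update, random draw from the posterior, optimization conditional on the draw) can be illustrated with a minimal sketch. This is not the paper's model or estimation procedure: it assumes a single agent choosing among actions with unknown Bernoulli payoff probabilities and uniform Beta(1, 1) priors, and the names `thompson_choice` and `simulate` are hypothetical, chosen only for this illustration.

```python
import random


def thompson_choice(successes, failures, rng):
    """Draw one belief per action from its Beta posterior, then optimize.

    Assumption: Beta(1, 1) priors over each action's payoff probability.
    """
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    # Conditional on the sampled beliefs, pick the action that looks best.
    return max(range(len(draws)), key=draws.__getitem__)


def simulate(true_probs, rounds, seed=0):
    """Run a Thompson-sampling agent against fixed Bernoulli payoffs."""
    rng = random.Random(seed)
    n = len(true_probs)
    successes = [0] * n
    failures = [0] * n
    choices = []
    for _ in range(rounds):
        action = thompson_choice(successes, failures, rng)
        rewarded = rng.random() < true_probs[action]
        # Bayesian update: Beta posteriors only need success/failure counts.
        if rewarded:
            successes[action] += 1
        else:
            failures[action] += 1
        choices.append(action)
    return choices, successes, failures


if __name__ == "__main__":
    choices, successes, failures = simulate([0.3, 0.5, 0.7], rounds=500)
    for a in range(3):
        print(f"action {a}: chosen {choices.count(a)} times")
```

Because each choice depends on a fresh posterior draw, behavior is random even though the agent optimizes given its sampled beliefs; the randomness shrinks as the posteriors concentrate. In a strategic setting the payoffs would depend on other players' actions, which this single-agent sketch deliberately leaves out.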