Nash Q-Learning Agents in Hotelling’s Model: Reestablishing Equilibrium
This paper examines the behavior of adaptive agents in a stochastic dynamic version of Hotelling's location model. We conduct a novel agent-based simulation of the Hotelling setting with two agents that adapt via the Nash Q-learning mechanism. This allows us to explore how the resulting behavior differs from the analytic solution of the famous static game-theoretic model, which imposes strong assumptions on the players. We find that under Nash Q-learning with a quadratic consumer cost function, agents that value future profits highly enough learn behavior resembling an aggressive market strategy: both agents make similar products and engage in a price war in order to drive their opponent out of the market. This behavior closely resembles the Principle of Minimum Differentiation from Hotelling's original paper with linear consumer costs, even though the simulation uses quadratic consumer cost functions, which in the original model would instead yield maximum differentiation of products. Our results thus suggest that the Principle of Minimum Differentiation can be justified by repeated interaction of the agents and long-run optimization.
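The setting described above can be sketched in code. The paper's own simulation is not reproduced here; the block below is a minimal illustrative sketch of stateless Nash Q-learning for price competition between two fixed-location firms on a Hotelling line with quadratic transport costs. All numerical parameters (the price grid, firm locations, transport coefficient, learning rate, discount factor, exploration rate, and episode count) are assumptions chosen for illustration, not values from the paper, and the stage-game Nash point is restricted to pure strategies for simplicity.

```python
import itertools
import random

# Illustrative sketch, not the paper's code: stateless Nash Q-learning for a
# two-firm Hotelling duopoly. Firms sit at fixed locations on [0, 1]; each
# consumer pays price plus quadratic transport cost t*(x - x_i)^2 and buys
# from the firm with the lower delivered cost.

PRICES = [0.2, 0.6, 1.0]          # discrete price menu (assumption)
X = (0.25, 0.75)                  # fixed firm locations (assumption)
T = 1.0                           # transport cost coefficient (assumption)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1 # learning, discount, exploration rates
N_CONSUMERS = 101

def profits(p1, p2):
    """Profit of each firm when consumers on a grid buy from the cheaper one."""
    d1 = d2 = 0
    for k in range(N_CONSUMERS):
        x = k / (N_CONSUMERS - 1)
        c1 = p1 + T * (x - X[0]) ** 2
        c2 = p2 + T * (x - X[1]) ** 2
        if c1 <= c2:
            d1 += 1
        else:
            d2 += 1
    share = 1.0 / N_CONSUMERS
    return p1 * d1 * share, p2 * d2 * share

def pure_nash(Q1, Q2):
    """Return a pure-strategy Nash cell of the bimatrix (Q1, Q2), or None."""
    n1, n2 = len(Q1), len(Q1[0])
    for i, j in itertools.product(range(n1), range(n2)):
        if Q1[i][j] >= max(Q1[k][j] for k in range(n1)) and \
           Q2[i][j] >= max(Q2[i][k] for k in range(n2)):
            return i, j
    return None

Q1 = [[0.0] * len(PRICES) for _ in PRICES]
Q2 = [[0.0] * len(PRICES) for _ in PRICES]
random.seed(0)

for _ in range(20000):
    # epsilon-greedy play around the current stage-game Nash point
    cell = pure_nash(Q1, Q2)
    if cell is None or random.random() < EPS:
        i, j = random.randrange(len(PRICES)), random.randrange(len(PRICES))
    else:
        i, j = cell
    r1, r2 = profits(PRICES[i], PRICES[j])
    # Nash Q-learning update: bootstrap on the Nash value of the stage game
    # induced by the current Q tables (fall back to the max if no pure Nash)
    nash = pure_nash(Q1, Q2)
    v1 = Q1[nash[0]][nash[1]] if nash else max(map(max, Q1))
    v2 = Q2[nash[0]][nash[1]] if nash else max(map(max, Q2))
    Q1[i][j] += ALPHA * (r1 + GAMMA * v1 - Q1[i][j])
    Q2[i][j] += ALPHA * (r2 + GAMMA * v2 - Q2[i][j])

final = pure_nash(Q1, Q2)
if final is not None:
    print("learned price pair:", PRICES[final[0]], PRICES[final[1]])
else:
    print("no pure-strategy Nash in the learned Q tables")
```

In a richer version closer to the paper's setting, locations would also be action variables and the state would encode past play; this sketch keeps locations fixed to isolate the Nash Q-learning update itself.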