I fully characterize the outcomes of a wide class of model-free reinforcement learning algorithms, such as Q-learning, in a prisoner's dilemma. The behavior is studied in the limit as players explore their options sufficiently and eventually stop experimenting. Whether the players learn to cooperate or defect can be determined in closed form from the relationship between the learning rate and the payoffs of the game. The results generalize to asymmetric learners and many experimentation rules, with implications for the issue of algorithmic collusion.
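As a rough illustration of the setup described above (not the paper's formal model), the sketch below pits two stateless, myopic ε-greedy Q-learners against each other in a repeated prisoner's dilemma, with exploration that decays to zero over time. The payoff values, the learning rate, and the decay schedule are hypothetical placeholders chosen only to make the sketch run.

```python
import numpy as np

# Hypothetical prisoner's dilemma payoffs (row player), with T > R > P > S.
# These specific numbers are illustrative, not taken from the paper.
R, S, T, P = 3.0, 0.0, 5.0, 1.0
PAYOFF = np.array([[R, S],   # action 0 = cooperate
                   [T, P]])  # action 1 = defect

ALPHA = 0.5        # learning rate; the paper relates this quantity to the payoffs
EPISODES = 200_000
rng = np.random.default_rng(0)

# Stateless Q-learning: each player keeps one Q-value per action.
q = [np.zeros(2), np.zeros(2)]

for t in range(EPISODES):
    eps = (1 + t) ** -0.5            # assumed decay: experimentation vanishes over time
    acts = []
    for i in range(2):
        if rng.random() < eps:
            acts.append(int(rng.integers(2)))   # explore
        else:
            acts.append(int(np.argmax(q[i])))   # exploit current estimate
    rewards = (PAYOFF[acts[0], acts[1]], PAYOFF[acts[1], acts[0]])
    for i in range(2):
        # Model-free update toward the realized payoff (myopic, i.e. discount of zero).
        q[i][acts[i]] += ALPHA * (rewards[i] - q[i][acts[i]])

print("Final Q-values:", q)
print("Learned play:", ["cooperate" if np.argmax(qi) == 0 else "defect" for qi in q])
```

Varying ALPHA relative to the payoff values in this toy setup changes which action each learner locks into once exploration has effectively stopped, which is the relationship the abstract says can be characterized in closed form.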