We investigate learning in a setting where, in each period, a population has to choose between two actions and the payoff of each action is unknown to the players. The population learns by reinforcement, and the environment is non-stationary, meaning that there is correlation between the...
Persistent link: https://www.econbiz.de/10005744301
In this paper we study learning procedures when counterfactuals (payoffs of not-chosen actions) are not observed. The decision maker reasons in two steps: First, she updates her propensities for each action after every payoff experience, where propensity is defined as how much she prefers each...
Persistent link: https://www.econbiz.de/10008485536