Summary: This paper explores the implications of possible bias cancellation when Rubin-style matching methods are applied to complete and incomplete data. After reviewing the naïve causal estimator and the approaches of Heckman and Rubin to the causal estimation problem, we show how missing data can complicate the estimation of average causal effects in different ways, depending on the nature of the missing-data mechanism. Contrary to published assertions in the literature, bias cancellation does not generally occur when the multivariate distribution of the errors is symmetric. It has, however, been observed in the case where selection into training is the treatment variable and earnings is the outcome variable. We offer a substantive rationale that conceptualizes bias cancellation as the result of a mixture process based on two distinct individual-level decision-making models. Although its general properties are unknown, bias cancellation, where it exists, appears to reduce the average bias of both OLS and matching methods relative to the symmetric-distribution case. Analysis of simulated data under a set of different scenarios suggests that matching methods do better than OLS in reducing the portion of bias that comes purely from the error distribution (i.e., from "selection on unobservables"). This advantage often holds in the incomplete-data case as well. When the error variables are generated by standard multivariate normal distributions, which lack the bias-cancellation property, matching appears to offer no advantage over OLS in reducing bias due purely to selection on unobservable variables.
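For orientation, the bias of the naïve estimator can be written out with the standard potential-outcomes decomposition. This is a sketch in notation assumed here rather than taken from the paper: $Y^1$ and $Y^0$ are the potential outcomes (earnings with and without training), $D$ is the treatment indicator, and $\pi = P(D = 1)$.

```latex
\begin{align*}
\hat{\delta}_{\text{naive}}
  &= E[Y^1 \mid D = 1] - E[Y^0 \mid D = 0] \\
  &= \underbrace{E[Y^1 - Y^0]}_{\text{average causal effect}}
   + \underbrace{E[Y^0 \mid D = 1] - E[Y^0 \mid D = 0]}_{\text{baseline (selection) bias}} \\
  &\quad + \underbrace{(1 - \pi)\,\bigl(E[Y^1 - Y^0 \mid D = 1] - E[Y^1 - Y^0 \mid D = 0]\bigr)}_{\text{differential treatment-effect bias}}
\end{align*}
```

Under this decomposition the two bias terms can offset each other when they have opposite signs. Reading "bias cancellation" in this sense is an interpretation offered for illustration, not a definition confirmed by the summary.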