Implementing Rubin's Alternative Multiple Imputation Method for Statistical Matching in Stata
This paper introduces two new Stata commands, smpred and smmatch, that implement the statistical matching procedure proposed by Rubin (1986). In Rubin's procedure, statistical matching generates a single dataset from several datasets, each of which contains its own variable of interest and all of which share a set of common variables. For two variables of interest that are never observed jointly for any unit, smpred generates predicted values of each as a function of the other variable of interest and a set of control variables, under a user-specified partial correlation between the two variables of interest; existing programs instead assume that the two are conditionally independent given the control variables. The smmatch command then matches observations from different datasets according to their predicted values, using a minimum distance criterion and conditional on a set of control variables, and it imputes the observed value of the match for the missing
JEL classification: C10 - Econometric and Statistical Methods: General. General; C39 - Econometric Methods: Multiple/Simultaneous Equation Models. Other; C53 - Forecasting and Other Model Applications
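The two steps described in the abstract (prediction under an assumed partial correlation, then minimum-distance matching) can be illustrated with a minimal Python sketch. It is not the implementation of smpred or smmatch, whose syntax and internals are not given here. The function names `ols` and `predict_and_match`, the simulated data, and the choice to match on the predicted value of a single variable are all assumptions made for illustration, and the sketch omits any matching within cells of the control variables.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and the residual standard deviation."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    return beta, np.sqrt(resid @ resid / dof)

def predict_and_match(Xa, ya, Xb, zb, rho):
    """Hypothetical helper: impute y for file B from donors in file A.

    File A observes (X, y); file B observes (X, z). rho is the assumed
    partial correlation between y and z given X (rho = 0 corresponds to
    conditional independence).
    """
    b_y, s_y = ols(Xa, ya)          # y on X, estimated in file A
    b_z, s_z = ols(Xb, zb)          # z on X, estimated in file B
    gamma = rho * s_y / s_z         # implied coefficient of z in y | X, z
    delta = rho * s_z / s_y         # implied coefficient of y in z | X, y

    # File B: z is observed, so predict y from X and z directly.
    yhat_b = Xb @ b_y + gamma * (zb - Xb @ b_z)

    # File A: z is missing, so predict z first and plug the prediction in.
    zhat_a = Xa @ b_z + delta * (ya - Xa @ b_y)
    yhat_a = Xa @ b_y + gamma * (zhat_a - Xa @ b_z)

    # Minimum-distance match on predicted y; impute the donor's observed y.
    dist = np.abs(yhat_b[:, None] - yhat_a[None, :])
    donor = dist.argmin(axis=1)
    return ya[donor], donor

# Simulated example data (purely illustrative).
rng = np.random.default_rng(0)
n_a, n_b = 200, 150
xa = rng.normal(size=n_a)
xb = rng.normal(size=n_b)
Xa = np.column_stack([np.ones(n_a), xa])
Xb = np.column_stack([np.ones(n_b), xb])
ya = 1.0 + 2.0 * xa + rng.normal(size=n_a)
zb = -1.0 + 0.5 * xb + rng.normal(size=n_b)

y_imputed, donor = predict_and_match(Xa, ya, Xb, zb, rho=0.5)
```

Setting `rho=0` in this sketch makes the predictions depend on the common variables only, which corresponds to the conditional independence assumption that the abstract attributes to existing programs.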