Alternative imputation techniques for complex metric variables
This paper deals with imputation techniques and strategies. Imputation proper usually begins after the first round of data editing, although many operations must precede it. In this editing step, missing or deficient items are identified and coded, and it is then decided which of them, if any, should be replaced by imputed values. Many imputation methods exist, each with several possible specifications. Consequently, it is not clear which method should finally be chosen, especially when one method is best in one respect and another method in a different respect. In this paper, we consider these questions through the following four imputation methods: (i) random hot decking, (ii) logistic regression imputation, (iii) linear regression imputation, and (iv) regression-based nearest neighbour hot decking. Each of the last two methods is applied with two different specifications. Two metric variables are used in the empirical tests: the first is very complex, while the second is more ordinary and thus easier to handle. The empirical examples are based on simulations, which clearly show the biases of the various methods and their specifications. In general, method (iv) seems recommendable, although its results are not perfect either.
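To make the contrast between methods (i) and (iv) concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single covariate x, a simple least-squares line as the regression model, and missing values coded as None; the function names and the toy data are hypothetical. Both methods impute only values actually observed among the respondents (donors); they differ in how the donor is chosen.

```python
import random


def random_hot_deck(y, rng):
    """Method (i), sketched: fill each missing value with a randomly drawn observed value."""
    donors = [v for v in y if v is not None]
    return [v if v is not None else rng.choice(donors) for v in y]


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x on complete cases."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def regression_nn_hot_deck(x, y):
    """Method (iv), sketched: fit a regression on the respondents, predict for
    every unit, and give each missing unit the observed value of the donor
    whose predicted value is nearest to its own prediction."""
    observed = [i for i, v in enumerate(y) if v is not None]
    a, b = fit_line([x[i] for i in observed], [y[i] for i in observed])
    predicted = [a + b * xi for xi in x]
    imputed = list(y)
    for i, v in enumerate(y):
        if v is None:
            donor = min(observed, key=lambda j: abs(predicted[j] - predicted[i]))
            imputed[i] = y[donor]
    return imputed


# Hypothetical toy data: None marks a missing item.
x = [1, 2, 3.4, 4, 5.8, 6]
y = [10, 21, None, 39, None, 62]

print(random_hot_deck(y, random.Random(0)))
print(regression_nn_hot_deck(x, y))
```

In this sketch the random hot deck ignores the covariate entirely, whereas the regression-based variant uses the model only to pick a donor and still imputes a real observed value rather than a model prediction.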
Year of publication: | 2003 |
---|---|
Authors: | Laaksonen, Seppo |
Published in: | Journal of Applied Statistics. - Taylor & Francis Journals, ISSN 0266-4763. - Vol. 30.2003, 9, p. 1009-1020 |
Publisher: | Taylor & Francis Journals |
Similar items by person
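- Laaksonen, Seppo, (1991)
- Teollisuustyöntekijöiden asema tulo- ja työaikatilastojen mukaan 1980-luvun alussa [The position of industrial workers according to income and working-time statistics in the early 1980s]
  Laaksonen, Seppo, (1984)
- Katovirheen korjaus kotitalousaineistossa [Correcting nonresponse error in household data]
  Laaksonen, Seppo, (1988)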