Using Stata for a memory-saving fixed-effects estimation of the three-way error-components model
Researchers trying to estimate tens or hundreds of thousands of fixed effects for two or more groups (workers and firms; pupils, teachers and schools; etc.) in datasets with large numbers of observations are often limited by the amount of computer memory available. Such a model is commonly estimated by sweeping out one of the effects with the fixed-effects transformation (time-demeaning) and including the remaining effects as dummy variables. If K is the number of fixed effects to be included as dummy variables and N is the number of observations, then the design matrix is of dimension N x K (neglecting any remaining right-hand-side regressors). The time-demeaned dummies have to be stored as “double” variables, which consume 8 bytes per cell in Stata. For example, with 2 million observations (N) and 10 thousand fixed effects (K), the memory requirement would be 160 gigabytes. This paper describes how the memory requirement can be reduced to that of storing only a K x K matrix, which in the given example is below 1 gigabyte. The paper also describes the Stata program felsdvreg.ado, which implements the method in Mata. Besides implementing the memory-saving estimation method, the program checks the identification of the effects and provides useful summary statistics.
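The memory figures in the abstract can be checked with a short calculation. The Python sketch below is illustrative only and is not the felsdvreg algorithm: it assumes 8 bytes per cell and 1 gigabyte = 10^9 bytes, and the helper names `matrix_gigabytes` and `accumulate_cross_product` are hypothetical. The second helper shows, on a toy example, the general idea of building a K x K cross-product matrix one observation at a time so that the N x K design matrix never has to be held in memory.

```python
BYTES_PER_CELL = 8  # assumption: one 8-byte double per matrix cell


def matrix_gigabytes(rows, cols):
    """Memory for a dense rows x cols matrix, in gigabytes (10^9 bytes)."""
    return rows * cols * BYTES_PER_CELL / 1e9


def accumulate_cross_product(row_stream, k):
    """Build the K x K matrix X'X row by row, without storing X itself."""
    xtx = [[0.0] * k for _ in range(k)]
    for row in row_stream:
        for a in range(k):
            if row[a] == 0.0:
                continue  # zero entries contribute nothing to row a of X'X
            for b in range(k):
                xtx[a][b] += row[a] * row[b]
    return xtx


# Figures from the abstract: N = 2 million observations, K = 10 thousand effects.
n_obs, n_effects = 2_000_000, 10_000
full_gb = matrix_gigabytes(n_obs, n_effects)         # N x K design matrix
reduced_gb = matrix_gigabytes(n_effects, n_effects)  # K x K matrix only
print(f"N x K: {full_gb} GB, K x K: {reduced_gb} GB")

# Toy data (hypothetical): three observations, two columns.
toy_rows = [[1.0, 0.0], [0.5, -0.5], [0.0, 1.0]]
toy_xtx = accumulate_cross_product(iter(toy_rows), 2)
print(toy_xtx)
```

Under these assumptions the full design matrix needs 160 gigabytes and the K x K matrix needs 0.8 gigabytes, which matches the "below 1 gigabyte" figure given above.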
| Year of publication: | 2008-07-03 |
|---|---|
| Authors: | Cornelissen, Thomas |
| Institutions: | Stata User Group |
Similar items by person
- Spatial Spillovers of Conflict in Somalia. Alfano, Marco, (2022)
- Early School Exposure, Test Scores, and Noncognitive Outcomes. Cornelissen, Thomas, (2019)
- Who benefits from universal child care? Estimating marginal returns to early child care attendance. Cornelissen, Thomas, (2018)