Simulation of "forward-backward" multiple-imputation technique in longitudinal clinical dataset
Most standard missing-data techniques have been designed for cross-sectional data. A "forward-backward" multiple-imputation algorithm has been developed to impute missing values in longitudinal data (Nevalainen, Kenward, and Virtanen, 2009, Statistics in Medicine 28: 3657-3669). This technique will be applied to The Health Improvement Network (THIN), a longitudinal primary-care database, to impute variables associated with the incidence of cardiovascular disease (CVD). A sample of 483 patients was extracted from THIN to test the performance of the algorithm before applying it to the whole dataset. This sample included individuals with information available on age, sex, deprivation quintile, height, weight, systolic blood pressure, and total serum cholesterol at each age from 65 to 69 years. CVD was identified if the patient was diagnosed with one of a predefined list of conditions at any of these ages; the patient was then considered to have CVD at each subsequent age. In this sample, measurements of weight, systolic blood pressure, and cholesterol were replaced with missing values such that the probability that data are missing decreases as age increases; i.e., the data are missing at random and the overall percentage of missing data is equivalent to that in THIN. We then applied the forward-backward algorithm, which imputes values at each time point by using the measurements before and after the one of interest and updates values sequentially. Ten complete datasets were created. A Poisson regression was fitted to each dataset, and the estimates were combined using Rubin's rules. These steps were repeated 200 times, and the coefficients were averaged. I will explain in more detail how the forward-backward algorithm works and will present the results obtained after multiple imputation with this algorithm. I will compare these results with those from the analysis performed before data were replaced with missing values and with those from a complete-case analysis to assess the performance of the algorithm.
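The pooling step mentioned in the abstract (combining estimates from the imputed datasets using Rubin's rules) can be sketched as follows. This is a minimal illustration in Python, not the authors' Stata code; the function name pool_rubin and the input numbers are hypothetical and are not results from the study. It assumes each imputed dataset yields one coefficient estimate and its squared standard error.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    estimates: one coefficient estimate per imputed dataset
    variances: the matching squared standard errors
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total, sqrt(total)


# Illustrative inputs only (hypothetical values, three imputations)
q_bar, total, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, total, se)
```

With ten imputed datasets, as in the abstract, the function would be called with ten estimates and ten variances per coefficient; averaging over the 200 simulation repetitions would then be applied to the pooled estimates.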
Year of publication: | 2010-09-17 |
---|---|
Authors: | Welch, Catherine ; Petersen, Irene ; Carpenter, James |
Institutions: | Stata User Group |
Similar items by person
- Petersen, Irene, (2011)
- Multiple imputation of missing data in longitudinal health records. Petersen, Irene, (2013)
- Welch, Catherine, (2014)