Case-control study power and sample size calculations using Stata
We use Stata's npnchi2 and nchi2 functions to calculate power and required sample size for case-control studies. Following the method described by Self et al. (1992), we create a large exemplary data set containing the expected risk factor frequencies among cases and controls under a given alternative hypothesis. Under that alternative, the likelihood ratio test statistic for the hypothesis of interest is distributed as a non-central chi-squared statistic, and the likelihood ratio test statistic obtained from analysing the exemplary data set approximates the non-centrality parameter of this distribution. We apply these methods to power and sample-size calculations for case-control studies of gene-gene and gene-environment interactions. Because case-control studies have low power to detect interactions, a wide range of strategies has been proposed. Required sample size depends on several design parameters, and because these methods are simple, the efficiency of many designs can be compared across ranges of those parameters, which is valuable at the planning stage of a study. We present results for population-based, family-based and matching schemes that have been proposed to improve power, and compare the power of the different designs. Stata programs are available for these comparisons.
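As a minimal sketch of the exemplary-data calculation, the Python code below works through the simplest case only: one binary risk factor tested on 1 degree of freedom. It does not cover the interaction models, and it is not the Stata programs mentioned above. All function names, parameter names and input values are hypothetical choices made for illustration. The power step corresponds to what nchi2 provides in Stata, and the search for the required non-centrality parameter corresponds to npnchi2. With 1 degree of freedom both can be written exactly in terms of the normal distribution, which is the shortcut assumed here.

```python
from math import log, sqrt
from statistics import NormalDist

_norm = NormalDist()

def exposure_in_cases(p0, odds_ratio):
    """Exposure prevalence among cases implied by control prevalence p0 and the odds ratio."""
    return odds_ratio * p0 / (1 - p0 + odds_ratio * p0)

def unit_noncentrality(p0, odds_ratio, controls_per_case=1.0):
    """Likelihood ratio statistic of the exemplary 2x2 table, scaled to one case
    and `controls_per_case` controls (assumption: a single binary risk factor)."""
    p1 = exposure_in_cases(p0, odds_ratio)
    k = controls_per_case
    # Expected cell frequencies under the alternative hypothesis
    observed = [p1, 1 - p1, k * p0, k * (1 - p0)]
    # Fitted frequencies under the null: the same exposure prevalence in both groups
    pooled = (p1 + k * p0) / (1 + k)
    fitted = [pooled, 1 - pooled, k * pooled, k * (1 - pooled)]
    return 2 * sum(o * log(o / f) for o, f in zip(observed, fitted))

def power_1df(ncp, alpha=0.05):
    """P(non-central chi-squared, 1 df, non-centrality ncp, exceeds the central critical value)."""
    z_crit = _norm.inv_cdf(1 - alpha / 2)  # square root of the chi-squared(1) critical value
    root = sqrt(ncp)
    return (1 - _norm.cdf(z_crit - root)) + _norm.cdf(-z_crit - root)

def required_ncp_1df(power=0.8, alpha=0.05):
    """Non-centrality parameter giving the target power, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if power_1df(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi

def cases_needed(p0, odds_ratio, power=0.8, alpha=0.05, controls_per_case=1.0):
    """Number of cases at which the scaled non-centrality reaches the required value."""
    unit = unit_noncentrality(p0, odds_ratio, controls_per_case)
    return required_ncp_1df(power, alpha) / unit

if __name__ == "__main__":
    # Hypothetical design: 30% exposure among controls, odds ratio 1.5, one control per case
    n_cases = cases_needed(p0=0.3, odds_ratio=1.5)
    print(f"cases needed: {n_cases:.0f}")
    print(f"power at 500 cases: {power_1df(500 * unit_noncentrality(0.3, 1.5)):.3f}")
```

Because the non-centrality parameter scales linearly with sample size, the exemplary table only needs to be analysed once per design. Comparing designs then amounts to recomputing the per-case non-centrality for each set of design parameters.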