Sample size and power analysis for sparse signal recovery in genome-wide association studies
Genome-wide association studies have successfully identified hundreds of novel genetic variants associated with many complex human diseases. However, there is a lack of rigorous work on evaluating the statistical power for identifying these variants. In this paper, we consider sparse signal identification in genome-wide association studies and present two analytical frameworks for detailed analysis of the statistical power for detecting and identifying the disease-associated variants. We present an explicit sample size formula for achieving a given false non-discovery rate while controlling the false discovery rate based on an optimal procedure. Sparse genetic variant recovery is also considered and a boundary condition is established in terms of sparsity and signal strength for almost exact recovery of both disease-associated variants and nondisease-associated variants. A data-adaptive procedure is proposed to achieve this bound. The analytical results are illustrated with a genome-wide association study of neuroblastoma. Copyright 2011, Oxford University Press.
Year of publication: | 2011 |
---|---|
Authors: | Xie, Jichun ; Cai, T. Tony ; Li, Hongzhe |
Published in: | Biometrika. - Biometrika Trust, ISSN 0006-3444. - Vol. 98.2011, 2, p. 273-290 |
Publisher: | Biometrika Trust |
Similar items by person
- Covariate-adjusted precision matrix estimation with an application in genetical genomics
  Cai, T. Tony, (2013)
- Chen, Jun, (2011)
- Optimal sparse segment identification with application in copy number variation analysis
  Jeng, X. Jessie, (2010)