Optimal designs to select individuals for genotyping conditional on observed binary or survival outcomes and non-genetic covariates

In gene-disease association studies, the cost of genotyping makes it economical to use a two-stage design in which only a subset of the cohort is genotyped. At the first stage, follow-up data and some risk factors or non-genetic covariates are collected for the whole cohort; at the second stage, a subset of the cohort is selected for genotyping. Intuitively, this selection can be made more efficient if the data collected at the first stage are used. The subset is selected by maximizing the information contained in the conditional probability of the genotype given the first-stage data and initial estimates of the parameters of interest. The proposed selection method is illustrated using logistic regression and Cox's proportional hazards model, and algorithms that find optimal or nearly optimal designs in a discrete design space are presented. Simulation comparisons between the D-optimal design, extreme selection and the case-cohort design suggest that the D-optimal design is the most efficient in terms of the variance of the estimated parameters, but extreme selection may be a good alternative for practical study design.
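
To make the idea of searching a discrete design space concrete, the following is a minimal sketch of a generic greedy D-optimal subset selection. It is not the authors' algorithm: the abstract does not specify the search procedure, and the function name `greedy_d_optimal`, the `ridge` parameter and the assumption that each individual's expected information contribution is already available as a p x p matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def greedy_d_optimal(info, k, ridge=1e-6):
    """Greedily pick k individuals to maximize the log-determinant of
    the summed information matrix (a D-optimality criterion).

    info  : array of shape (n, p, p); info[i] is the assumed expected
            information contribution of individual i (hypothetical input).
    k     : number of individuals to select for genotyping.
    ridge : small diagonal term (an assumption of this sketch) that keeps
            the matrix invertible before enough individuals are selected.
    """
    info = np.asarray(info, dtype=float)
    n, p, _ = info.shape
    total = ridge * np.eye(p)
    available = [True] * n
    chosen = []
    for _ in range(k):
        best_i, best_val = None, -np.inf
        for i in range(n):
            if not available[i]:
                continue
            # slogdet returns (sign, log|det|); it is more stable than det.
            sign, logdet = np.linalg.slogdet(total + info[i])
            if sign > 0 and logdet > best_val:
                best_i, best_val = i, logdet
        chosen.append(best_i)
        available[best_i] = False
        total = total + info[best_i]
    return chosen, total
```

A greedy search of this kind finds a design that is good but not guaranteed to be optimal, which is one reason exchange-type refinements are often applied afterwards. Extreme selection, by contrast, would simply rank individuals on the first-stage outcome and take the tails, with no determinant to evaluate.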

Year of publication:	2009
Authors:	Karvanen, Juha ; Kulathinal, Sangita ; Gasbarra, Dario
Published in:	Computational Statistics & Data Analysis. - Elsevier, ISSN 0167-9473. - Vol. 53.2009, 5, p. 1782-1793
Publisher:	Elsevier

Type of publication:	Article
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10005130840