Modeling Read Counts for CNV Detection in Exome Sequencing Data

Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.

MoreLess

Year of publication:	2011
Authors:	Love Michael I. ; Alena, Myšičková ; Ruping, Sun ; Vera, Kalscheuer ; Martin, Vingron ; Haas Stefan A.
Published in:	Statistical Applications in Genetics and Molecular Biology. - De Gruyter, ISSN 1544-6115. - Vol. 10.2011, 1, p. 1-30
Publisher:	De Gruyter

More details

Extent:	text/html
Type of publication:	Article
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10009365923