Modeling Read Counts for CNV Detection in Exome Sequencing Data
Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
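The idea in the abstract, a hidden Markov model whose hidden states are copy numbers and whose observations are per-target read counts scaled by a background depth, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the exomeCopy implementation: the Poisson emission distribution, the single `stay` transition probability, the `floor` applied to the zero-copy state, and all function and parameter names are hypothetical. Positional covariates such as GC-content are left out entirely.

```python
import math


def poisson_logpmf(k, mu):
    # Log-probability of observing count k under a Poisson with mean mu > 0.
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def viterbi_copy_number(counts, background, copy_states=(0, 1, 2, 3, 4),
                        normal=2, stay=0.99, floor=0.01):
    """Most likely copy-number path for one sample (illustrative only).

    counts     -- observed read count per target region
    background -- expected read count per target at the normal copy number,
                  e.g. derived from a control set (assumption)
    stay       -- assumed probability of keeping the same state between targets
    floor      -- assumed minimum depth ratio, so copy number 0 has a mean > 0
    """
    n_states = len(copy_states)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (n_states - 1))

    def emit(t, s):
        # Expected depth scales with copy number relative to the normal state.
        ratio = max(copy_states[s] / normal, floor)
        return poisson_logpmf(counts[t], background[t] * ratio)

    def trans(p, s):
        return log_stay if p == s else log_move

    # Initialise with a uniform prior over states.
    score = [math.log(1.0 / n_states) + emit(0, s) for s in range(n_states)]
    back = []

    for t in range(1, len(counts)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + trans(p, s))
            new_score.append(score[best_prev] + trans(best_prev, s) + emit(t, s))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [copy_states[s] for s in path]


# Toy data: a run of targets at roughly half the background depth.
toy_counts = [100, 98, 103, 52, 49, 51, 47, 101, 99]
toy_background = [100] * len(toy_counts)
toy_path = viterbi_copy_number(toy_counts, toy_background)
```

Segmenting with a hidden state path, rather than thresholding each target's depth ratio separately, lets neighbouring targets support each other: a single noisy target has to overcome the transition penalty twice before it is called as its own segment.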
Year of publication: | 2011 |
---|---|
Authors: | Love, Michael I. ; Myšičková, Alena ; Sun, Ruping ; Kalscheuer, Vera ; Vingron, Martin ; Haas, Stefan A. |
Published in: | Statistical Applications in Genetics and Molecular Biology. - De Gruyter, ISSN 1544-6115. - Vol. 10.2011, 1, p. 1-30 |
Publisher: | De Gruyter |
Similar items by person
- On the Power of Profiles for Transcription Factor Binding Site Detection
  Rahmann, Sven (2003)
- Parameter estimation for the calibration and variance stabilization of microarray data
  Huber, Wolfgang (2003)