Variable selection in clustering via Dirichlet process mixture models
The increased collection of high-dimensional data in various fields has raised a strong interest in clustering algorithms and variable selection procedures. In this paper, we propose a model-based method that addresses the two problems simultaneously. We introduce a latent binary vector to identify discriminating variables and use Dirichlet process mixture models to define the cluster structure. We update the variable selection index using a Metropolis algorithm and obtain inference on the cluster structure via a split-merge Markov chain Monte Carlo technique. We explore the performance of the methodology on simulated data and illustrate an application with a DNA microarray study. Copyright 2006, Oxford University Press.
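As an illustration of the kind of Metropolis update the abstract mentions for the variable selection index, the following is a minimal Python sketch, not the authors' implementation. It assumes a symmetric proposal that flips one randomly chosen entry of the latent binary vector, so the acceptance probability reduces to a ratio of posterior values. The names `metropolis_flip` and `make_toy_log_post`, and the per-variable scores, are hypothetical. In the actual method the posterior would also depend on the cluster allocations from the Dirichlet process mixture, which this toy example omits.

```python
import math
import random


def make_toy_log_post(scores):
    """Return a toy log-posterior over binary vectors (hypothetical stand-in).

    Each selected variable j contributes scores[j]; a real model would
    instead evaluate the marginal likelihood given the current clusters.
    """
    def log_post(gamma):
        return sum(s for g, s in zip(gamma, scores) if g == 1)
    return log_post


def metropolis_flip(gamma, log_post, rng):
    """One Metropolis step: propose flipping a single entry of gamma."""
    proposal = list(gamma)          # copy so the current state is untouched
    j = rng.randrange(len(gamma))   # pick one variable uniformly at random
    proposal[j] = 1 - proposal[j]   # toggle its inclusion indicator
    log_ratio = log_post(proposal) - log_post(gamma)
    # Symmetric proposal: accept with probability min(1, exp(log_ratio)).
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal
    return list(gamma)
```

Calling `metropolis_flip` repeatedly with a seeded `random.Random` instance yields a chain over inclusion vectors; variables with large positive toy scores end up selected most of the time.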
| Year of publication: | 2006 |
|---|---|
| Authors: | Kim, Sinae ; Tadesse, Mahlet G. ; Vannucci, Marina |
| Published in: | Biometrika. - Biometrika Trust, ISSN 0006-3444. - Vol. 93.2006, 4, p. 877-893 |
| Publisher: | Biometrika Trust |
Similar items by person
- Bayesian Variable Selection in Clustering High-Dimensional Data
  Tadesse, Mahlet G., (2005)
- Theory and Methods - Bayesian Variable Selection in Clustering High-Dimensional Data
  Tadesse, Mahlet G., (2005)
- Bayesian variable selection in clustering high-dimensional data
  Tadesse, Mahlet G., (2005)