On multivariate binary data clustering and feature weighting
This paper presents an approach that partitions data sets of unlabeled binary vectors without a priori information about the number of clusters or the saliency of the features. The unsupervised binary feature selection problem is addressed using finite mixture models of multivariate Bernoulli distributions. Using stochastic complexity, the proposed model simultaneously determines the number of clusters in a given data set of binary vectors and the saliency of the features used. Several applications involving real data, document classification, and image categorization show the merits of the proposed approach.
Year of publication: | 2010 |
---|---|
Authors: | Bouguila, Nizar |
Published in: | Computational Statistics & Data Analysis. - Elsevier, ISSN 0167-9473. - Vol. 54.2010, 1, p. 120-134 |
Publisher: | Elsevier |
Similar items by person
- Infinite Dirichlet mixture models learning via expectation propagation
  Fan, Wentao, (2013)
- Bouguila, Nizar, (2010)