On multivariate binary data clustering and feature weighting

This paper presents an approach that partitions data sets of unlabeled binary vectors without a priori information about the number of clusters or the saliency of the features. The unsupervised binary feature selection problem is approached using finite mixture models of multivariate Bernoulli distributions. Using stochastic complexity, the proposed model determines simultaneously the number of clusters in a given data set composed of binary vectors and the saliency of the features used. We conduct different applications involving real data, document classification and images categorization to show the merits of the proposed approach.

MoreLess

Year of publication:	2010
Authors:	Bouguila, Nizar
Published in:	Computational Statistics & Data Analysis. - Elsevier, ISSN 0167-9473. - Vol. 54.2010, 1, p. 120-134
Publisher:	Elsevier

More details

Type of publication:	Article
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10008484560