James-Stein shrinkage to improve k-means cluster analysis
We study a general algorithm to improve the accuracy in cluster analysis that employs the James-Stein shrinkage effect in k-means clustering. We shrink the centroids of clusters toward the overall mean of all data using a James-Stein-type adjustment, and then the James-Stein shrinkage estimators act as the new centroids in the next clustering iteration until convergence. We compare the shrinkage results to the traditional k-means method. A Monte Carlo simulation shows that the magnitude of the improvement depends on the within-cluster variance and especially on the effective dimension of the covariance matrix. Using the Rand index, we demonstrate that accuracy increases significantly in simulated data and in a real data example.
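The iteration described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact estimator: it uses the textbook positive-part James-Stein factor with a single pooled, spherical within-cluster variance, whereas the abstract indicates that the published adjustment depends on the effective dimension of the covariance matrix. The names `js_shrink` and `js_kmeans` and their parameters are hypothetical.

```python
import numpy as np

def js_shrink(center, grand_mean, n_k, sigma2):
    """Positive-part James-Stein shrinkage of one centroid toward grand_mean.

    Assumption: spherical noise with per-coordinate variance sigma2, so the
    centroid (a mean of n_k points) has variance sigma2 / n_k per coordinate.
    """
    p = center.shape[0]
    if p <= 2:
        # The classical factor gives no shrinkage benefit for p <= 2.
        return center.copy()
    diff = center - grand_mean
    dist2 = float(diff @ diff)
    if dist2 == 0.0 or n_k == 0:
        return grand_mean.copy()
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / (n_k * dist2))
    return grand_mean + factor * diff

def js_kmeans(X, k, n_iter=100, seed=0):
    """k-means in which shrunken centroids drive the next assignment step."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    # Initialise with k distinct observations chosen at random.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: nearest current (shrunken) centroid.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stable: converged
        labels = new_labels
        # Ordinary cluster means; an empty cluster keeps its old centroid.
        means = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                means[j] = members.mean(axis=0)
        # Pooled within-cluster variance per coordinate (spherical assumption).
        resid = X - means[labels]
        sigma2 = (resid ** 2).sum() / max(1, (n - k) * p)
        # Update step: shrink each cluster mean toward the overall mean.
        for j in range(k):
            n_j = int((labels == j).sum())
            if n_j > 0:
                centers[j] = js_shrink(means[j], grand_mean, n_j, sigma2)
    return labels, centers
```

Calling `labels, centers = js_kmeans(X, k)` on an `(n, p)` array returns the final assignments and the shrunken centroids. Under these assumptions each returned centroid lies no farther from the overall mean than the ordinary mean of its cluster, and the shrinkage is strongest for small, noisy clusters in many dimensions.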
Year of publication: | 2010 |
---|---|
Authors: | Gao, Jinxin ; Hitchcock, David B. |
Published in: | Computational Statistics & Data Analysis. - Elsevier, ISSN 0167-9473. - Vol. 54.2010, 9, p. 2113-2127 |
Publisher: | Elsevier |
Keywords: | Centroids ; Effective dimension ; k-means clustering ; Stein estimation |
Similar items by person
- Smoothing dissimilarities to cluster binary data. Hitchcock, David B. (2008)
- Improved Estimation of Dissimilarities by Presmoothing Functional Data. Hitchcock, David B. (2006)
- Improved estimation of dissimilarities by presmoothing functional data. Hitchcock, David B. (2006)