New tools for evaluating the results of cluster analyses
Clustering methods are designed for finding groups in data, i.e., for grouping similar objects (variables or observations) into the same cluster and dissimilar objects into separate clusters. Although the main idea is rather simple, carrying out a cluster analysis remains a challenging task. The number of clustering methods is huge, and clustering involves many choices: the decision between basic approaches (e.g., hierarchical and partitioning methods), the choice of a dissimilarity or similarity measure, the selection of a particular linkage method when performing a hierarchical agglomerative cluster analysis, the choice of an initial partition when carrying out a partitioning cluster analysis, and the determination of the appropriate number of clusters. Each of these decisions can affect the classification results.

Apart from two commands for determining the number of clusters (cluster stop, cluster dendrogram), Stata has no built-in tools for examining clustering results. We therefore developed some simple tools that provide further evaluation criteria:

* programs assisting in determining the number of clusters (Mojena’s stopping rules for hierarchical clustering techniques; PRE coefficient, F-Max statistic, and Beale’s F values for a partitioning cluster analysis),
* a program for testing the stability of classifications produced by different cluster analyses (Rand index), and
* a program that computes ETA2 to assess how well the clustering variables separate the clusters.

The presentation will compare these programs with other cluster-analysis tools (agglomeration schedule, scree diagram).
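Two of the criteria named in the abstract can be illustrated with a short sketch. The Python code below is not the author's Stata programs; it assumes the textbook definitions: the unadjusted Rand index as the share of object pairs on which two partitions agree, and ETA2 as the between-cluster sum of squares divided by the total sum of squares for a single clustering variable. The function names `rand_index` and `eta_squared` are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index: share of object pairs treated alike by both partitions.

    Assumption: the plain (not chance-corrected) index is meant.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # A pair counts as agreement when both partitions put it together
        # or both partitions keep it apart.
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


def eta_squared(values, labels):
    """Between-cluster sum of squares over total sum of squares for one variable.

    Assumption: this standard ANOVA definition matches the ETA2 in the abstract.
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("variable has no variance")
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    ss_between = sum(
        len(members) * (sum(members) / len(members) - grand_mean) ** 2
        for members in groups.values()
    )
    return ss_between / ss_total
```

Under these definitions, a Rand index of 1 means the two cluster solutions are identical up to relabeling, and an ETA2 close to 1 means the variable separates the clusters almost perfectly.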
| Year of publication: | 2006-05-24 |
|---|---|
| Authors: | Schaeper, Hildegard |
| Institutions: | Stata User Group |
Similar items by person
- Schaeper, Hildegard, (1999)
- Schaeper, Hildegard, (1999)
- Übergang von der Schule zur Hochschule: Entscheidungsprozess, Erwartungen an Hochschule und Studium. Schaeper, Hildegard, (1984)