Evaluation of clustering algorithms for word sense disambiguation
Word sense disambiguation in text is still a difficult problem, as the best supervised methods require laborious and costly preparation of training data. This work focuses on the evaluation of a few selected clustering algorithms in the task of word sense disambiguation. We used five datasets for two languages (English and Polish). Five clustering algorithms (k-means, k-medoids, hierarchical agglomerative clustering, hierarchical divisive clustering, graph-partitioning-based clustering) and two weighting schemes were tested. The best parameters of the algorithms were chosen using 5 × 2 cross-validation. The BCubed measure was employed for the evaluation of clustering. We conclude that with these settings agglomerative hierarchical clustering achieves the best results for all the datasets.
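The abstract names BCubed as the evaluation measure but does not say which variant was used. As a rough illustration only, the sketch below computes the standard BCubed precision, recall, and F1 for non-overlapping clusters. The function name `bcubed` and the toy labels are hypothetical and are not taken from the article.

```python
from collections import Counter


def bcubed(clusters, senses):
    """Return BCubed (precision, recall, f1) for two parallel label lists.

    clusters[i] is the cluster assigned to occurrence i,
    senses[i] is its gold-standard sense.
    Assumes hard (non-overlapping) clustering; this is an illustrative
    sketch, not necessarily the exact variant used in the article.
    """
    if len(clusters) != len(senses) or not clusters:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(clusters)
    cluster_sizes = Counter(clusters)            # items per cluster
    sense_sizes = Counter(senses)                # items per gold sense
    pair_sizes = Counter(zip(clusters, senses))  # items per (cluster, sense)

    # Per-item precision: share of the item's cluster that has its sense.
    precision = sum(
        pair_sizes[(c, s)] / cluster_sizes[c] for c, s in zip(clusters, senses)
    ) / n
    # Per-item recall: share of the item's sense that landed in its cluster.
    recall = sum(
        pair_sizes[(c, s)] / sense_sizes[s] for c, s in zip(clusters, senses)
    ) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy data: five occurrences of one ambiguous word.
    predicted = [0, 0, 0, 1, 1]
    gold = ["a", "a", "b", "b", "b"]
    print(bcubed(predicted, gold))
```

Because the scores are averaged over individual occurrences rather than over clusters, placing everything in a single cluster yields perfect recall but low precision, which is why the two values are usually reported together or combined into F1.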
Year of publication: | 2012 |
---|---|
Authors: | Broda, Bartosz ; Mazur, Wojciech |
Published in: | International Journal of Data Analysis Techniques and Strategies. - Inderscience Enterprises Ltd, ISSN 1755-8050. - Vol. 4.2012, 3, p. 219-236 |
Publisher: | Inderscience Enterprises Ltd |
Subject: | clustering algorithms | word sense disambiguation | WSD | BCubed | senseval | bag of words | English | Polish |
Similar items by subject
- Word sense disambiguation in Tamil using Indo-WordNet and cross-language semantic similarity. Karuppaiah, Deepa (2021)
- Word Sense Disambiguation using Aggregated Similarity based on WordNet Graph Representation. ZURINI, Mădălina (2013)
- Word Sense Based Hindi-Tamil Statistical Machine Translation. Kumar K., Vimal (2018)