Automatic Generation of Association Thesaurus Based on Domain-Specific Text Collection
This work examines a distributional approach to the automatic generation of associative thesauri for a specific domain. The distributional approach rests on the assumption that an associative link between terms of the domain is determined by the statistics of their co-occurrence in thematically related texts. Its advantage is that it works from raw material (for example, a collection of documents from the domain) and requires no additional knowledge about the domain. Because the approach relies only on statistical methods and takes into account neither lexical nor semantic information, it can cover a large lexical space of terms. This, however, leads to its main shortcoming: it produces an excessive number of “unnecessary” links between words, links that carry little information from a practical point of view. To address this problem, the work proposes combining methods of distributional statistics, latent semantic analysis, and graph theory.
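The co-occurrence assumption described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes whitespace tokenization, document-level co-occurrence, and a chi-square test on a 2x2 contingency table (the record lists "chi-square test" as a subject keyword, but the abstract does not say how it is applied). The function names and the threshold of 3.84 (the 0.05 critical value for one degree of freedom) are illustrative choices, and the latent semantic analysis and graph steps are not shown.

```python
from itertools import combinations


def chi_square_2x2(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 contingency table."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A term occurs in every document or in none: no evidence either way.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def association_links(docs, threshold=3.84):
    """Return {(term_a, term_b): score} for positively associated term pairs.

    Assumption: two terms co-occur when they appear in the same document.
    """
    doc_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    n_docs = len(doc_sets)
    # Document frequency of each term.
    df = {t: sum(t in d for d in doc_sets) for t in vocab}

    links = {}
    for a, b in combinations(vocab, 2):
        n11 = sum(a in d and b in d for d in doc_sets)  # both terms present
        n10 = df[a] - n11                               # only a present
        n01 = df[b] - n11                               # only b present
        n00 = n_docs - n11 - n10 - n01                  # neither present
        score = chi_square_2x2(n11, n10, n01, n00)
        # Keep only pairs that co-occur more often than chance would predict.
        if score >= threshold and n11 * n00 > n10 * n01:
            links[(a, b)] = score
    return links
```

Raising the threshold is one simple way to reduce the number of weak links that such purely statistical scoring produces; the combination with latent semantic analysis and graph methods proposed in the work is a separate step.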
Year of publication: | 2014-06 |
---|---|
Authors: | Nugumanova, Aliya ; Issabaeva, Dinara ; Baiburin, Yerzhan |
Institutions: | International Institute of Social and Economic Sciences |
Subject: | LSA | thesaurus | chi-square test | graph |
Saved in:
Series: | Proceedings of International Academic Conferences. - ISSN 2336-5617. |
---|---|
Type of publication: | Book / Working Paper |
Notes: | Published in Proceedings of the 10th International Academic Conference, Jun 2014, pages 529-538. Number 0201861. 10 pages |
Classification: | C80 - Data Collection and Data Estimation Methodology; Computer Programs. General |
Persistent link: https://www.econbiz.de/10011210315
Similar items by subject
- Sauer, Robert M., (2007)
- Data Mining und XML. Modularisierung und Automatisierung von Verarbeitungsschritten
  Wissuwa, Stefan, (2003)
- Löbbert, Chris, (2008)