and non‐relevant documents. The reduction of information overload requires that IR systems provide the capability of … screening the most valuable documents out of the mass of potentially or marginally relevant documents. This paper introduces a … new concept‐based method to analyse the text characteristics of documents at varying relevance levels. The results of the …
Persistent link: https://www.econbiz.de/10014854521
This paper presents an approach for performing knowledge discovery in texts through qualitative and quantitative analyses of high‐level textual characteristics. Instead of applying mining techniques on attribute values, terms or keywords extracted from texts, the discovery process works over...
Persistent link: https://www.econbiz.de/10014854528
In this paper, methods for both speeding up passage processing and examining more passages using parallel computers are explored. The number of passages processed is varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously...
Persistent link: https://www.econbiz.de/10014671410
Books include novels, dictionaries, telephone books, textbooks, anthologies, instruction manuals, proceedings of meetings and directories. The phrase “electronic books” has been applied to some types of CD‐ROM systems, palm‐top CD players, on‐demand text, electronic document systems of...
Persistent link: https://www.econbiz.de/10014674028
Purpose – The purpose of this paper is to discuss how some companies are scanning books to make the text fully searchable. Design/methodology/approach – Discusses several cases of scanning books and fully searchable text. Findings – By making so much forgotten text searchable, accessible...
Persistent link: https://www.econbiz.de/10014686181
The term‐weighting function known as IDF was proposed in 1972, and has since been extremely widely used, usually as part of a TF*IDF function. It is often described as a heuristic, and many papers have been written (some based on Shannon's Information Theory) seeking to establish some...
Persistent link: https://www.econbiz.de/10014853078
Purpose – Scholarly monographs are a major information resource in the humanities. The purpose of this study is to evaluate the effectiveness of abstracting and indexing (A&I) databases and library catalogues (OPACs) for subject retrieval of these monographs. Design/methodology/approach – A...
Persistent link: https://www.econbiz.de/10014853217
Purpose – Automated sentence‐level relevance and novelty detection would be of direct benefit to many information retrieval systems. However, the low level of agreement between human judges performing the task is an issue of concern. In previous approaches, annotators were asked to identify...
Persistent link: https://www.econbiz.de/10014853324
Purpose – The purpose of this paper is first to provide a critical conceptual discussion of different uses of the notion of text, especially in the case of expressions including words as well as images, second to consider the notion of document as an alternative to the notion of text, and...
Persistent link: https://www.econbiz.de/10014853455
Present and possible future developments in the techniques of document management are reviewed, the major ones being text retrieval and scanning and OCR. Acquisition, indexing and thesauri, publishing and dissemination and the document management industry are also addressed. The emerging...
Persistent link: https://www.econbiz.de/10014854516