Mining frequent itemsets and association rules is a popular and well-researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and...
Persistent link: https://www.econbiz.de/10005101463
The traveling salesperson (or, salesman) problem (TSP) is a well-known and important combinatorial optimization problem. The goal is to find the shortest tour that visits each city in a given list exactly once and then returns to the starting city. Despite this simple problem statement, solving...
Persistent link: https://www.econbiz.de/10005101531
Clustering streams of continuously arriving data has become an important application of data mining in recent years and efficient algorithms have been proposed by several researchers. However, clustering alone neglects the fact that data in a data stream is not only characterized by the...
Persistent link: https://www.econbiz.de/10008460745
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. The R package clue provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data...
Persistent link: https://www.econbiz.de/10005106015
Being among the most popular and efficient classification and regression methods currently available, implementations of support vector machines exist in almost every popular programming language. Currently four R packages contain SVM-related software. The purpose of this paper is to present and...
Persistent link: https://www.econbiz.de/10005106037
kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using kernel-based algorithms. The package contains dot product primitives (kernels), implementations of support vector...
Persistent link: https://www.econbiz.de/10005106065
Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R...
Persistent link: https://www.econbiz.de/10009245485
This paper reviews tests for structural change in linear regression models from the generalized fluctuation test framework as well as from the F test (Chow test) framework. It introduces a unified approach for implementing these tests and presents how these ideas have been realized in an R...
Persistent link: https://www.econbiz.de/10005113326
This paper describes the "strucplot" framework for the visualization of multi-way contingency tables. Strucplot displays include hierarchical conditional plots such as mosaic, association, and sieve plots, and can be combined into more complex, specialized plots for visualizing conditional...
Persistent link: https://www.econbiz.de/10005113350
During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical...
Persistent link: https://www.econbiz.de/10008460710