Kernel Factory: An Ensemble of Kernel Machines
We propose an ensemble method for kernel machines. The training data is randomly split into a number of mutually exclusive partitions defined by a row and column parameter. Each partition forms an input space and is transformed by a kernel function into a kernel matrix K. Subsequently, each K is used as training data for a base binary classifier (Random Forest). This results in a number of predictions equal to the number of partitions. A weighted average combines the predictions into one final prediction. To optimize the weights, a genetic algorithm is used. This approach has the advantage of simultaneously promoting (1) diversity, (2) accuracy, and (3) computational speed. (1) Diversity is fostered because the individual K’s are based on a subset of features and observations, (2) accuracy is sought by optimizing the weights with the genetic algorithm, and (3) computational speed is obtained because the computation of each K can be parallelized. Using five times two-fold cross-validation, we benchmark the classification performance of Kernel Factory against Random Forest and Kernel-Induced Random Forest (KIRF). We find that Kernel Factory has significantly better performance than Kernel-Induced Random Forest. When the right kernel is specified, Kernel Factory is also significantly better than Random Forest. In addition, an open-source R software package of the algorithm (kernelFactory) is available from CRAN.
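The partitioning, kernel transformation, and weighted combination described in the abstract can be illustrated with a minimal Python sketch. It is not the kernelFactory package's code or API: all function names are hypothetical, the RBF kernel is an assumed choice (the abstract does not name one), and the interleaved split of shuffled indices is only one way to obtain mutually exclusive partitions. Training the Random Forest base classifiers and searching for weights with a genetic algorithm are left out.

```python
import math
import random


def make_partitions(n_rows, n_cols, row_parts, col_parts, rng):
    """Return row_parts * col_parts disjoint (row_indices, col_indices) blocks."""
    rows = list(range(n_rows))
    cols = list(range(n_cols))
    rng.shuffle(rows)
    rng.shuffle(cols)
    # Interleaved slices of the shuffled indices give disjoint groups (assumption).
    row_groups = [rows[i::row_parts] for i in range(row_parts)]
    col_groups = [cols[j::col_parts] for j in range(col_parts)]
    return [(r, c) for r in row_groups for c in col_groups]


def submatrix(X, rows, cols):
    """Extract the block of X given by the row and column indices."""
    return [[X[i][j] for j in cols] for i in rows]


def rbf(u, v, gamma=1.0):
    """RBF kernel; an assumed choice, any kernel function could be passed instead."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def kernel_matrix(A, B, kernel=rbf):
    """Kernel matrix with entry [i][j] = kernel(A[i], B[j])."""
    return [[kernel(a, b) for b in B] for a in A]


def combine(predictions, weights):
    """Weighted average of per-partition prediction vectors."""
    total = sum(weights)
    n = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]
```

In such a sketch, each block returned by `submatrix` would be passed to `kernel_matrix(block, block)` to obtain one K, and each K would serve as the training features of one base classifier. New observations would be kernelized against the same block's training rows with `kernel_matrix(new_block, block)` before prediction. Because the blocks are independent, the per-partition kernel computations could run in parallel.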
Year of publication: | 2012-12 |
---|---|
Authors: | BALLINGS, M. ; POEL, D. VAN DEN |
Institutions: | Faculteit Economie en Bedrijfskunde, Universiteit Gent |
Similar items by person
- The Relevant Length of Customer Event History for Churn Prediction: How long is long enough?
  BALLINGS, M. (2012)
- Evaluating the Added Value of Pictorial Data for Customer Churn Prediction
  BALLINGS, M. (2013)
- Deep Habits in Consumption: A Spatial Panel Analysis Using Scanner Data
  VERHELST, B. (2012)