A clustering-based discretization for supervised learning
We address the problem of discretization of continuous variables for machine learning classification algorithms. Existing procedures do not use interdependence between the variables towards this goal. Our proposed method uses clustering to exploit such interdependence. Numerical results show that this improves the classification performance in almost all cases. Even if an existing algorithm can successfully operate with continuous variables, better performance is obtained if the variables are first discretized. An additional advantage of discretization is that it reduces the overall computation time.
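The general idea in the abstract, deriving bins from clusters found jointly over all variables rather than from each variable alone, can be illustrated with a rough sketch. This is not the authors' algorithm, which this record does not describe. It assumes plain k-means with a farthest-point initialization, ignores class labels, and places cut points midway between neighbouring cluster centroids along each variable. All function names, the `min_gap` parameter, and the toy data are hypothetical.

```python
from bisect import bisect_right


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means over all variables jointly; returns the centroids."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def cut_points(centroids, dim, min_gap=1e-6):
    """Midpoints between neighbouring centroid coordinates along one variable.

    Coordinates closer than min_gap are merged, so clusters that coincide
    on this variable do not produce a spurious bin boundary.
    """
    coords = sorted(c[dim] for c in centroids)
    distinct = [coords[0]]
    for x in coords[1:]:
        if x - distinct[-1] > min_gap:
            distinct.append(x)
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def discretize(value, cuts):
    """Map a continuous value to the index of its bin."""
    return bisect_right(cuts, value)


# Toy data: three clusters in two variables (x, y).
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
    (5.0, 5.0), (5.2, 4.8), (4.8, 5.2),
    (9.0, 1.0), (9.2, 0.8), (8.8, 1.2),
]
centroids = kmeans(points, k=3)
cuts_x = cut_points(centroids, dim=0)  # two boundaries, so three bins for x
cuts_y = cut_points(centroids, dim=1)  # one boundary, so two bins for y
x_bins = [discretize(p[0], cuts_x) for p in points]
```

In this toy data the two outer clusters share nearly the same y value, so y receives two bins while x receives three. The bin structure here comes from the joint clustering rather than from each variable's own distribution.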
| Year of publication: | 2010 |
|---|---|
| Authors: | Gupta, Ankit ; Mehrotra, Kishan G. ; Mohan, Chilukuri |
| Published in: | Statistics & Probability Letters. - Elsevier, ISSN 0167-7152. - Vol. 80.2010, 9-10, p. 816-824 |
| Publisher: | Elsevier |
| Subject: | Discretization ; Clustering ; Binning ; Supervised learning |
Similar items by person
- Squeezing the last drop: Cluster-based classification algorithm
  Mehrotra, Kishan G., (2007)
- Nonparametric tests for ordered alternatives in the bivariate case
  Johnson, Richard A., (1972)
- Pedestrian flow characteristics studies : a review
  Gupta, Ankit, (2015)