A Calibrated Multiclass Extension of AdaBoost
Abstract: AdaBoost is a popular and successful data mining technique for binary classification, but there is no universally agreed-upon extension of the method to problems with more than two classes. Most multiclass generalizations simply reduce the problem to a series of binary classification problems. The statistical interpretation of AdaBoost is that it operates through loss-based estimation: using an exponential loss function as a surrogate for misclassification loss, it sequentially minimizes empirical risk by fitting a base classifier to iteratively reweighted training data. While there are several extensions using loss-based estimation with multiclass base classifiers, these rely on multiclass versions of the exponential loss that are not classification calibrated: unless restrictions are placed on conditional class probabilities, it is possible to attain optimal surrogate risk yet poor misclassification risk. In this work, we introduce a new AdaBoost extension called AdaBoost.
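For context, the surrogate the abstract refers to is the exponential loss L(y, f(x)) = exp(-y f(x)) for labels y in {-1, +1}: each boosting round fits a base classifier to reweighted training data, with weights growing on misclassified examples. Below is a minimal Python/NumPy sketch of that binary procedure, included only as an illustration of the reweighting mechanism; it is not the paper's calibrated multiclass method, and the decision-stump base learner and helper names are assumptions of this sketch.

    import numpy as np

    def fit_stump(X, y, w):
        # Base learner (an assumption of this sketch): find the decision
        # stump minimizing weighted 0-1 error on the current weights w.
        n, d = X.shape
        best = (0, 0.0, 1, np.inf)  # (feature, threshold, polarity, error)
        for j in range(d):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] > t, 1, -1)
                    err = w[pred != y].sum()
                    if err < best[3]:
                        best = (j, t, s, err)
        return best

    def adaboost(X, y, n_rounds=50):
        # y in {-1, +1}. Sequentially minimizes empirical exponential risk
        # by fitting stumps to iteratively reweighted data.
        n = len(y)
        w = np.full(n, 1.0 / n)  # uniform initial weights
        ensemble = []
        for _ in range(n_rounds):
            j, t, s, err = fit_stump(X, y, w)
            err = max(err, 1e-12)
            alpha = 0.5 * np.log((1 - err) / err)  # step size from the exponential loss
            pred = s * np.where(X[:, j] > t, 1, -1)
            w *= np.exp(-alpha * y * pred)  # upweight misclassified points
            w /= w.sum()
            ensemble.append((alpha, (j, t, s)))
        return ensemble

    def predict(ensemble, X):
        # Sign of the weighted vote of the base classifiers.
        f = np.zeros(len(X))
        for alpha, (j, t, s) in ensemble:
            f += alpha * s * np.where(X[:, j] > t, 1, -1)
        return np.sign(f)

A quick usage check on synthetic data (again illustrative only):

    X = np.random.randn(200, 2)
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
    model = adaboost(X, y, n_rounds=20)
    print((predict(model, X) == y).mean())  # training accuracy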
Year of publication: 2011
Authors: Rubin, Daniel B.
Published in: Statistical Applications in Genetics and Molecular Biology. De Gruyter, ISSN 1544-6115. Vol. 10 (2011), No. 1, pp. 1-24
Publisher: De Gruyter
Similar items by person:
- Statistical Issues and Limitations in Personalized Medicine Research with Clinical Trials / Rubin, Daniel B. (2012)
- Rubin, Daniel B. (2008)