Loss functions for binary classification and class probability estimation
What are the natural loss functions for binary class probability estimation? This question has a simple answer: so-called "proper scoring rules". These loss functions, known from subjective probability, measure the discrepancy between true probabilities and estimates thereof. They comprise all commonly used loss functions: log-loss, squared error loss, boosting loss (which we derive from boosting's exponential loss), and cost-weighted misclassification losses. We also introduce a larger class of possibly uncalibrated loss functions that can be calibrated with a link function. An example is exponential loss, which is related to boosting. Proper scoring rules are fully characterized by weight functions ω(η) on class probabilities η = P[Y = 1]. These weight functions give immediate practical insight into loss functions: high mass of ω(η) points to the class probabilities η where the proper scoring rule strives for greatest accuracy. For example, both log-loss and boosting loss have poles near zero and one, hence rely on extreme probabilities. We show that the freedom of choice among proper scoring rules can be exploited when the two types of misclassification have different costs: one can choose proper scoring rules that focus on the cost c of class 0 misclassification by concentrating ω(η) near c. We also show that cost-weighting uncalibrated loss functions can achieve tailoring. "Tailoring" is often beneficial for classical linear models, whereas nonparametric boosting models show fewer benefits. We illustrate "tailoring" with artificial and real datasets both for linear models and for nonparametric models based on trees, and compare it with traditional linear logistic regression and one recent version of boosting, called "LogitBoost".
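As a rough illustration of what "proper" means here, the sketch below checks numerically that the expected loss η·L(1, q) + (1 − η)·L(0, q) is smallest when the estimate q equals the true class probability η. The closed forms used for log-loss, squared error loss and boosting loss are standard textbook forms, with boosting loss taken as exponential loss composed with half the logit link; the function names and the grid search are assumptions made for this example and are not taken from the thesis.

```python
import math

# Partial losses L(y, q) for label y in {0, 1} and estimate q in (0, 1).
# These closed forms are assumptions of this sketch, not quoted from the thesis.

def log_loss(y, q):
    return -math.log(q) if y == 1 else -math.log(1.0 - q)

def squared_loss(y, q):
    return (1.0 - q) ** 2 if y == 1 else q ** 2

def boosting_loss(y, q):
    # Exponential loss exp(-y* F) with F = 0.5 * log(q / (1 - q)), y* in {-1, +1}.
    return math.sqrt((1.0 - q) / q) if y == 1 else math.sqrt(q / (1.0 - q))

def expected_loss(loss, eta, q):
    # Expected loss of estimate q when the true P[Y = 1] is eta.
    return eta * loss(1, q) + (1.0 - eta) * loss(0, q)

def best_estimate(loss, eta, grid_size=999):
    # Brute-force minimizer over the grid 1/1000, 2/1000, ..., 999/1000.
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda q: expected_loss(loss, eta, q))

if __name__ == "__main__":
    for name, loss in [("log-loss", log_loss),
                       ("squared error", squared_loss),
                       ("boosting loss", boosting_loss)]:
        for eta in (0.1, 0.3, 0.8):
            print(name, eta, best_estimate(loss, eta))
```

For each of the three losses the grid minimizer coincides with η, which is the defining property of a proper scoring rule. The losses differ in how steeply the expected loss rises away from η, and that is the aspect the weight function ω(η) describes.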
Year of publication: 2005-01-01


Authors:  Shen, Yi 
Publisher: ScholarlyCommons
Subject:  Statistics 
Similar items by subject

(1988)

(1987)

(1987)
Similar items by person

Shên, I., (1925)

Understanding content voting based on social foraging theory
Xu, Lingling, (2017)

Pei, Zhuan, (2016)