Using Predicted Outcome Stratified Sampling to Reduce the Variability in Predictive Performance of a One-Shot Train-and-Test Split for Individual Customer Predictions
Since it is generally recognized that performance estimates obtained on the data used to construct a model are overly optimistic, predictive modeling practice frequently relies on a one-shot train-and-test split: one set of observations is used to estimate the model and a separate, held-out set to validate it. Previous research has indicated that stratified sampling reduces the variability of predictive performance estimates in a linear regression application. In this paper, we validate these findings on six real-life European predictive modeling applications in marketing and credit scoring with a dichotomous outcome variable. Using a procedure we describe as predicted outcome stratified sampling in a logistic regression model, we confirm the reduction in variability and find that the gain is almost always significant, even in large data sets, and in certain applications markedly large.
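To make the procedure concrete, the sketch below shows one plausible reading of predicted outcome stratified sampling, assuming it works by fitting a preliminary logistic regression on all observations, binning the predicted probabilities into strata (deciles here), and drawing the one-shot train-and-test split within those strata. The function name, the number of strata, the test-set fraction, and the use of AUC as the performance measure are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of predicted outcome stratified sampling (assumed reading,
# not the paper's exact procedure): a preliminary logistic regression supplies
# predicted probabilities, which define strata for the one-shot split.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def predicted_outcome_stratified_split(X, y, n_strata=10, test_size=0.3, seed=0):
    # Preliminary model on the full data, used only to form the strata.
    prelim = LogisticRegression(max_iter=1000).fit(X, y)
    p_hat = prelim.predict_proba(X)[:, 1]

    # Assign each observation to a stratum based on its predicted probability.
    edges = np.quantile(p_hat, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.digitize(p_hat, edges)

    # One-shot split, stratified on the predicted-outcome strata, so that both
    # partitions cover the full range of predicted outcomes.
    return train_test_split(
        X, y, test_size=test_size, stratify=strata, random_state=seed
    )


if __name__ == "__main__":
    # Synthetic dichotomous-outcome data stand in for a marketing or credit
    # scoring application.
    X, y = make_classification(n_samples=5000, n_features=10, random_state=1)
    X_tr, X_te, y_tr, y_te = predicted_outcome_stratified_split(X, y)

    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"Hold-out AUC: {auc:.3f}")
```

Under this reading, the stratification ensures that the training and test partitions have similar distributions of predicted outcomes, which is the mechanism by which the variability of the one-shot performance estimate is expected to shrink relative to simple random splitting.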