The minimum number of misclassifications achievable with affine hyperplanes on a given set of labeled points is a key quantity in both statistics and computational learning theory. However, determining this quantity exactly is essentially NP-hard, cf. Höfgen, Simon and van Horn (1995). Hence,...
Persistent link: https://www.econbiz.de/10010316484
Cronbach’s alpha is a popular method to measure reliability, e.g. in quantifying the reliability of a score to summarize the information of several items in questionnaires. The alpha coefficient is known to be non-robust. We study the behavior of this coefficient in different settings to...
Persistent link: https://www.econbiz.de/10010316559
Persistent link: https://www.econbiz.de/10010316591
In this paper we show that the recent notion of regression depth can be used as a data-analytic tool to measure the amount of separation between successes and failures in the binary response framework. Extending this algorithm allows us to compute the overlap in data sets which are commonly...
Persistent link: https://www.econbiz.de/10010316690
Data sets from car insurance companies often have a high-dimensional complex dependency structure. The use of classical statistical methods such as generalized linear models or Tweedie’s compound Poisson model can yield problems in this case. Christmann (2004) proposed a general approach to...
Persistent link: https://www.econbiz.de/10010296633
We investigate properties of kernel-based regression (KBR) methods which are inspired by the convex risk minimization method of support vector machines. We first describe the relation between the loss function used by the KBR method and the tail of the response variable Y. We then establish a...
Persistent link: https://www.econbiz.de/10010296663
Many robust statistical procedures have two drawbacks. Firstly, they are so computer-intensive that they can hardly be used for massive data sets. Secondly, robust confidence intervals for the estimated parameters or robust predictions according to the fitted models are often unknown. Here, we...
Persistent link: https://www.econbiz.de/10010296669
The optimization of the hyper-parameters of a statistical procedure or machine learning task is a crucial step for obtaining a minimal error. Unfortunately, the optimization of hyper-parameters usually requires many runs of the procedure and hence is very costly. A more detailed knowledge of the...
Persistent link: https://www.econbiz.de/10010296699
Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are so computer-intensive that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and non-parametric confidence intervals for the...
Persistent link: https://www.econbiz.de/10010296722
The goals of this paper are twofold: we describe common features in data sets from motor vehicle insurance companies and we investigate a general strategy which exploits the knowledge of such features. The results of the strategy are a basis to develop insurance tariffs. The strategy is applied...
Persistent link: https://www.econbiz.de/10010306241