Model selection in high dimensions: a quadratic-risk-based approach
We propose a general class of risk measures which can be used for data-based evaluation of parametric models. The loss function is defined as the generalized quadratic distance between the true density and the model proposed. These distances are characterized by a simple quadratic form structure that is adaptable through the choice of a non-negative definite kernel and a bandwidth parameter. Using asymptotic results for the quadratic distances we build a quick-to-compute approximation for the risk function. Its derivation is analogous to the Akaike information criterion but, unlike the Akaike information criterion, the quadratic risk is a global comparison tool. The method does not require resampling, which is a great advantage when point estimators are expensive to compute. The method is illustrated by using the problem of selecting the number of components in a mixture model, where it is shown that, by using an appropriate kernel, the method is computationally straightforward in arbitrarily high data dimensions. In this same context it is shown that the method has some clear advantages over the Akaike information criterion and Bayesian information criterion. Copyright 2008 Royal Statistical Society.
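The kernel-based quadratic distance mentioned in the abstract can be illustrated with a minimal sketch. It assumes a normalized Gaussian kernel with bandwidth `h` and a mixture model whose components are spherical normals, neither of which is specified in this record. Under those assumptions every integral in the distance between the empirical distribution and the model reduces to a normal density evaluation, so the cost does not grow with a numerical integration grid as the dimension increases. The function names, the synthetic data, and the parameter values are hypothetical. The code computes only the plug-in distance, not the risk approximation proposed in the article.

```python
import math
import random


def spherical_normal_pdf(x, mean, var):
    """Density of N(mean, var * I) evaluated at x."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
    return math.exp(-0.5 * sq_dist / var - 0.5 * d * math.log(2.0 * math.pi * var))


def quadratic_distance(data, weights, means, variances, h):
    """Plug-in quadratic distance between the empirical distribution of `data`
    and a spherical Gaussian mixture, using the Gaussian kernel
    K_h(x, y) = N(x; y, h^2 I). All three terms are available in closed form."""
    n = len(data)
    h2 = h * h
    comps = list(zip(weights, means, variances))
    # Double sum of the kernel over all pairs of observations.
    data_data = sum(spherical_normal_pdf(xi, xj, h2)
                    for xi in data for xj in data) / n ** 2
    # Kernel integrated against one model component: N(x_i; mean, (var + h^2) I).
    data_model = sum(w * spherical_normal_pdf(xi, m, v + h2)
                     for xi in data for w, m, v in comps) / n
    # Kernel integrated against two components: N(mean_k; mean_l, (var_k + var_l + h^2) I).
    model_model = sum(wk * wl * spherical_normal_pdf(mk, ml, vk + vl + h2)
                      for wk, mk, vk in comps for wl, ml, vl in comps)
    return data_data - 2.0 * data_model + model_model


# Synthetic illustration (assumed data): two well-separated clusters in 5 dimensions.
random.seed(0)
dim, n = 5, 100
data = [[random.gauss(3.0 if i % 2 else -3.0, 1.0) for _ in range(dim)]
        for i in range(n)]

# Candidate models with hand-picked (not fitted) parameters.
d_one = quadratic_distance(data, [1.0], [[0.0] * dim], [10.0], h=1.0)
d_two = quadratic_distance(data, [0.5, 0.5], [[-3.0] * dim, [3.0] * dim],
                           [1.0, 1.0], h=1.0)
print(d_one, d_two)
```

In this sketch the two-component candidate should give the smaller distance, since it matches the distribution that generated the data. A full model-selection procedure along the lines of the abstract would also need fitted parameters and the article's risk correction, which are not reproduced here.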
Year of publication: | 2008 |
---|---|
Authors: | Ray, Surajit ; Lindsay, Bruce G. |
Published in: | Journal of the Royal Statistical Society Series B. - Royal Statistical Society - RSS, ISSN 1369-7412. - Vol. 70.2008, 1, p. 95-118 |
Publisher: | Royal Statistical Society - RSS |
Similar items by person
- Kernels, Degrees of Freedom, and Power Properties of Quadratic Distance Goodness-of-Fit Tests
  Lindsay, Bruce G., (2014)
- Distance-based Model-Selection with application to the Analysis of Gene Expression Data
  Ray, Surajit, (2003)
- Functional regression models for South African economic indicators : a growth curve perspective
  Mangisa, Siphumlile, (2019)