High dimensional thresholded regression and shrinkage effect
type="main" xml:id="rssb12037-abs-0001"> <title type="main">Summary</title> <p>High dimensional sparse modelling via regularization provides a powerful tool for analysing large-scale data sets and obtaining meaningful interpretable models. The use of non-convex penalty functions shows advantage in selecting important features in high dimensions, but the global optimality of such methods still demands more understanding. We consider sparse regression with a hard thresholding penalty, which we show to give rise to thresholded regression. This approach is motivated by its close connection with <math xmlns="http://www.w3.org/1998/Math/MathML" altimg="urn:x-wiley:13697412:media:rssb12037:rssb12037-math-0001" wiley:location="equation/rssb12037-math-0001.gif"><msub><mi>L</mi><mn>0</m n></msub></math>-regularization, which can be unrealistic to implement in practice but of appealing sampling properties, and its computational advantage. Under some mild regularity conditions allowing possibly exponentially growing dimensionality, we establish the oracle inequalities of the resulting regularized estimator, as the global minimizer, under various prediction and variable selection losses, as well as the oracle risk inequalities of the hard thresholded estimator followed by further <math xmlns="http://www.w3.org/1998/Math/MathML" altimg="urn:x-wiley:13697412:media:rssb12037:rssb12037-math-0002" wiley:location="equation/rssb12037-math-0002.gif"><msub><mi>L</mi><mn>2</m n></msub></math>-regularization. The risk properties exhibit interesting shrinkage effects under both estimation and prediction losses. We identify the optimal choice of the ridge parameter, which is shown to have simultaneous advantages to both the <math xmlns="http://www.w3.org/1998/Math/MathML" altimg="urn:x-wiley:13697412:media:rssb12037:rssb12037-math-0003" wiley:location="equation/rssb12037-math-0003.gif"><msub><mi>L</mi><mn>2</m n></msub></math>-loss and the prediction loss. 
These new results and phenomena are evidenced by simulation and real data examples.
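The two-step procedure described in the summary, hard thresholding of an initial estimate followed by a further L_2-regularized (ridge) refit on the selected support, can be sketched as below. This is a minimal illustration under assumed settings: the simulated data, the threshold `tau`, and the ridge parameter `lam` are illustrative choices, not the paper's data-driven or optimal values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sparse linear model y = X beta + noise (illustrative settings):
# only the first s of p coefficients are nonzero.
n, p, s = 100, 20, 3
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Step 1: hard-threshold an initial estimate (here OLS, feasible since n > p).
# tau is a tuning parameter; the paper studies data-driven choices.
beta_init = np.linalg.lstsq(X, y, rcond=None)[0]
tau = 0.5
support = np.flatnonzero(np.abs(beta_init) > tau)

# Step 2: ridge (L2-regularized) refit restricted to the selected support;
# lam is the ridge parameter whose optimal choice the paper identifies.
lam = 1.0
Xs = X[:, support]
beta_refit = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(support)), Xs.T @ y)

beta_hat = np.zeros(p)
beta_hat[support] = beta_refit
print("selected variables:", sorted(support.tolist()))
```

In this well-separated toy example the threshold recovers the true support, and the ridge refit mildly shrinks the retained coefficients, the shrinkage effect discussed in the summary.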
Year of publication: 2014
Authors: Zheng, Zemin; Fan, Yingying; Lv, Jinchi
Published in: Journal of the Royal Statistical Society Series B. - Royal Statistical Society - RSS, ISSN 1369-7412. - Vol. 76.2014, 3, p. 627-649
Publisher: Royal Statistical Society - RSS
Saved in: Online Resource
Similar items by person
- Nonsparse learning with latent variables. Zheng, Zemin (2021)
- Panning for gold : ‘model‐X’ knockoffs for high dimensional controlled variable selection. Candès, Emmanuel (2018)
- Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space. Fan, Yingying (2013)
- More ...