A probabilistic explanation of a natural phenomenon
Regression is the procedure that attempts to relate a $p$-dimensional vector of predictors $\mathbf{X}$ to a response variable $Y$. Frequently, we deal with regression problems that have a large number of predictors. In those cases, we try to reduce the dimension of the predictor vector in order to find the predictors that affect the response the most. One of the most widely used methods is Principal Components Analysis. This analysis seeks the first few $d$ ($\ll p$) principal components, which are generally believed to better describe the relationship between the predictors $\mathbf{X}$ and the response $Y$.

This procedure, however, has not been appropriately justified. In practice, it often occurs that the first few principal components are more highly correlated with the response variable, and better describe the relationship between the predictors and the response variable, than the other principal components. However, there seems to be no logical reason for this tendency, and there are cases, albeit less frequent, where the first few principal components have weaker correlation with the response. There is a long-standing debate on this issue among statisticians, and, to date, it has not been adequately resolved.

In this thesis I ask, and attempt to answer, the following questions: Is there a tendency for the first few principal components of the predictor to be more strongly related to the response? If so, what is the reason behind this tendency? And how strong is this tendency?
| Year of publication: | 2008-05-18 |
|---|---|
| Authors: | Artemiou, Andreas A |
| Other Persons: | Bing Li (contributor) |
| Publisher: | Penn State |
| Subject: | Statistics |
Similar items by subject
- Statistical pocket book of Hungary (1958)
- Statistisches Taschenbuch Ungarns (1991)
- Berauer, Wilfried, (1953)
Similar items by person