Simply Better: Using Regression Models to Estimate Major League Batting Averages
We consider the problem of estimating a Major League Baseball player's batting average in the second half of a season based on his performance in the first half. We fit two linear regression models to players' averages from each half of the 2004 season, use these models to predict batting averages in the latter half of 2005, and compare the results to those achieved by three Bayesian estimators considered by Brown (2008). The linear models consistently outperform the Bayesian estimators on four measures of error. Since the regression models use data from 2004 as well as 2005, while Brown's estimators were based strictly on 2005 data, we also compare the performance of the linear models to that of the Bayesian estimators when the Bayesian estimators are based on the same amount of data. We find the linear models to be superior in this case as well. As a further test, we use the same methods to predict on-base percentages in the last half of the 2005 season, and we find that the linear models again do a better job. While we change the question proposed in Brown's original paper, our results are a valuable reminder of the power of linear regression.
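The basic workflow described in the abstract (fit a line on one season's half-season averages, then predict another season's second half) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the numbers are made up rather than taken from the 2004 or 2005 seasons, the model is a one-predictor least-squares line rather than either of the paper's two models, and sum of squared errors stands in for the paper's four error measures, which the abstract does not name.

```python
# Illustrative sketch only: a one-predictor least-squares line,
# not the models or data used in the paper.

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x):
    """Apply a fitted (intercept, slope) pair to new first-half averages."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x]

def sum_squared_error(predicted, actual):
    """One possible error measure (an assumption, not the paper's)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

# Made-up "training season" averages: first half and second half.
train_first = [0.250, 0.300, 0.280, 0.320, 0.240]
train_second = [0.260, 0.285, 0.270, 0.295, 0.255]
model = fit_line(train_first, train_second)

# Made-up "test season" averages.
test_first = [0.310, 0.230, 0.270]
test_second = [0.290, 0.250, 0.272]
predicted = predict(model, test_first)

# Compare against the naive rule "second half equals first half".
naive_sse = sum_squared_error(test_first, test_second)
model_sse = sum_squared_error(predicted, test_second)
print("model:", model)
print("naive SSE:", naive_sse, "regression SSE:", model_sse)
```

A fitted slope below 1 pulls extreme first-half averages back toward the overall mean, which is one intuition for why a plain regression can compete with shrinkage-style estimators.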
Year of publication: | 2010 |
---|---|
Authors: | Dan, Neal ; James, Tan ; Feng, Hao ; S, Wu Samuel |
Published in: | Journal of Quantitative Analysis in Sports. - De Gruyter, ISSN 1559-0410. - Vol. 6.2010, 3, p. 1-14 |
Publisher: | De Gruyter |
Similar items by person
- Score Statistics for Mapping Quantitative Trait Loci
  N, Chang Myron, (2009)
- Hao Feng, (2010)
- School quality and housing prices : empirical evidence from a natural experiment in Shanghai, China
  Hao Feng, (2013)