I am not, nor have I ever been a member of a data-mining discipline
This paper argues that classical statistics and standard econometrics are based on a desire to meet scientific standards for accumulating reliable knowledge. Science requires two inputs: mining of existing data for inspiration, and new or 'out-of-sample' data for predictive testing. Avoidance of data-mining is neither possible nor desirable. In economics out-of-sample data is relatively scarce, so the production process should intensively exploit the existing data. But the two inputs should be thought of as complements rather than substitutes, and we neglect the importance of out-of-sample testing in the production of reliable knowledge. Avoidance of data-mining is not a substitute for tests conducted in new samples. The problem is not that data-mining corrupts the process; the problem is our collective neglect of out-of-sample encompassing, stability and forecast tests. So the data-mining issue diverts us from the crucial margin.
Year of publication: | 2001 |
---|---|
Authors: | Greene, Clinton |
Published in: | Journal of Economic Methodology. - Taylor & Francis Journals, ISSN 1350-178X. - Vol. 7.2001, 2, p. 217-230 |
Publisher: | Taylor & Francis Journals |
Subject: | Repeated Testing; Stability; Time Series; Experimental; Data-Mining; Applied Methods; Prediction |
Similar items by subject
- Crowd-squared : amplifying the predictive power of search trend data
  Brynjolfsson, Erik, (2016)
- Using high-dimensional corporate governance variables to predict firm performance
  Nicholas, Benes, (2024)
- Are all text news just a noise for investors? : impact of online texts on Bitcoin returns
  Damjanović, Aleksandar, (2023)
Similar items by person