Statistical sources of variable selection bias in classification tree algorithms based on the Gini index
Evidence for variable selection bias in classification tree algorithms based on the Gini Index is reviewed from the literature and embedded into a broader explanatory scheme: Variable selection bias in classification tree algorithms based on the Gini Index can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the splitting criterion when plug-in estimates of entropy measures like the Gini Index are employed. The relevance of these sources of variable selection bias in the different simulation study designs is examined. Variable selection bias due to the explored sources applies to all classification tree algorithms based on empirical entropy measures like the Gini Index, Deviance and Information Gain, and to both binary and multiway splitting algorithms.
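The estimation bias of a plug-in Gini estimate mentioned in the abstract can be illustrated for the simplest case, a binary response in a node of size n. The sketch below is not taken from the paper; the function names `gini` and `expected_plugin_gini` are hypothetical, and the binomial sampling model is an assumption made for illustration. Under that assumption the expectation of the plug-in estimate 2·p̂·(1 − p̂) equals (n − 1)/n times the true Gini index, so the estimate is biased downward and the bias is larger in smaller nodes.

```python
from math import comb


def gini(p: float) -> float:
    """Gini index of a binary response with class-1 probability p."""
    return 2.0 * p * (1.0 - p)


def expected_plugin_gini(n: int, p: float) -> float:
    """Exact expectation of the plug-in estimate 2 * p_hat * (1 - p_hat).

    Assumes p_hat = k / n with k ~ Binomial(n, p); the expectation is
    computed by summing over all possible counts k (no simulation).
    """
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k) * gini(k / n)
        for k in range(n + 1)
    )


if __name__ == "__main__":
    p = 0.3  # illustrative class probability, chosen arbitrarily
    for n in (2, 5, 10, 50):
        exact = expected_plugin_gini(n, p)
        closed_form = (n - 1) / n * gini(p)
        # The two columns agree up to floating-point error.
        print(n, exact, closed_form)
```

Summing over the binomial distribution rather than drawing random samples keeps the comparison with the closed form (n − 1)/n · 2p(1 − p) exact, which makes the dependence of the bias on node size easy to see.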
Year of publication: | 2005 |
---|---|
Authors: | Strobl, Carolin |
Publisher: | München : Ludwig-Maximilians-Universität München, Sonderforschungsbereich 386 - Statistische Analyse diskreter Strukturen |
freely available
Series: | Discussion Paper ; 420 |
---|---|
Type of publication: | Book / Working Paper |
Type of publication (narrower categories): | Working Paper |
Language: | English |
Other identifiers: | 10.5282/ubm/epub.1789 [DOI]; 485090155 [GVK]; hdl:10419/31113 [Handle] |
Persistent link: https://www.econbiz.de/10010266216
Similar items by person
- A new method for detecting differential item functioning in the Rasch model. Strobl, Carolin (2011)
- Flexible Rasch Mixture Models with Package psychomix. Frick, Hannah (2011)
- On the estimation of standard errors in cognitive diagnosis models. Philipp, Michel (2016)