A Simple Procedure for Identifying Outliers in Multivariate Data
Statistical analysis is often disturbed by objects that differ extremely from the rest of the data. Such outliers can have different causes, so it is always advisable to examine them separately. In one- or two-dimensional cases, outliers are easily recognized in a frequency distribution. In multidimensional data they can be identified by sub-ordering. It is proposed to do this by calculating pairwise distances. The necessary standardization of the variables can be done using the sum of all pairwise distances. Proceeding this way, possible outliers can easily be identified in tabular as well as in graphical form. It can also be shown which dimensions, that is, which variables, contribute to the outlier status. Because it is quite simple to remove any object or variable, one can see what happens to the rest of the data without those extreme values.
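The idea described above can be illustrated with a minimal sketch. The abstract does not specify the distance measure or the exact standardization, so the following assumes absolute differences per variable, each divided by the sum of all pairwise differences for that variable; the function name `pairwise_outlier_scores` and the sample data are hypothetical and serve only as illustration.

```python
import numpy as np


def pairwise_outlier_scores(X):
    """Return (scores, contributions) for an (n_objects, n_variables) array.

    Assumption: distance = absolute difference per variable, standardized
    by the sum of all pairwise differences of that variable.
    """
    X = np.asarray(X, dtype=float)
    # Absolute pairwise differences per variable, shape (n, n, p).
    diffs = np.abs(X[:, None, :] - X[None, :, :])
    # Sum of all pairwise distances per variable, each pair counted once.
    totals = diffs.sum(axis=(0, 1)) / 2.0
    # Avoid division by zero for constant variables.
    safe_totals = np.where(totals > 0, totals, 1.0)
    standardized = diffs / safe_totals
    # Share of each variable in each object's distance to all others, shape (n, p).
    contributions = standardized.sum(axis=1)
    # Overall score per object: sum over variables.
    scores = contributions.sum(axis=1)
    return scores, contributions


# Hypothetical data: the last object is extreme on the first variable.
data = np.array([
    [1.0, 10.0],
    [1.2, 11.0],
    [0.9, 9.5],
    [1.1, 10.5],
    [5.0, 10.2],
])

scores, contributions = pairwise_outlier_scores(data)
ranking = np.argsort(scores)[::-1]  # objects ordered from most to least extreme
```

The `contributions` table shows, per object, which variables drive its distance from the rest. Removing an object or a variable amounts to deleting a row or column of `data` and calling the function again.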
Corporate statistics and corporate cost accounting ; Business data processing. General ; Individual Working Papers, Preprints ; No country specification