Automated detection and explanation of exceptional values in a datamining environment
In this paper, we describe an extension of the datamining framework with automated causal diagnosis, which makes it possible to automatically detect and explain exceptional values in support of business decision tasks. At present the diagnostic process is carried out manually by (business) analysts: the analyst explores the multidimensional data to spot exceptions and then navigates the data to find the reasons for these exceptions. The same functionality can be built into a conventional OLAP (On-Line Analytical Processing) or datamining system using a generic explanation formalism, which mimics the work of business decision makers in diagnostic processes. Here, diagnosis is defined as finding the best explanation of unexpected behaviour (symptoms or exceptional values) of a system under study. This definition assumes that we know which behaviour to expect from a correctly working system; otherwise we would not be able to determine whether the actual behaviour matches what we expect. The expected behaviour in a datamining environment can be derived from a statistical model or from the expert knowledge of analysts. The central goal is to identify the specific knowledge structures and reasoning methods required to construct computerized explanations from multidimensional data and business models. We propose a methodology that automatically generates explanations for exceptional values in multidimensional business data. The methodology was tested on a case study involving the comparison of the financial results of a firm’s business units.
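To make the notion of an exceptional value concrete, the following minimal sketch flags cells of a two-dimensional table whose actual value deviates strongly from an expected value. It is an illustration under stated assumptions, not the methodology of the paper: the additive expectation model (grand mean plus row and column effects), the function names `expected_additive` and `find_exceptions`, the threshold of 2.0 and the sample figures are all hypothetical. It covers only the detection step, not the explanation step.

```python
from statistics import mean, pstdev


def expected_additive(table):
    """Expected value per cell under an assumed additive model:
    row mean + column mean - grand mean."""
    n_rows, n_cols = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [row_means[i] + col_means[j] - grand for j in range(n_cols)]
        for i in range(n_rows)
    ]


def find_exceptions(table, threshold=2.0):
    """Return (row, col, actual, expected, score) for every cell whose
    standardized residual exceeds the threshold in absolute value."""
    expected = expected_additive(table)
    residuals = [
        [table[i][j] - expected[i][j] for j in range(len(table[0]))]
        for i in range(len(table))
    ]
    # Population standard deviation of all residuals is used as the scale.
    sigma = pstdev(r for row in residuals for r in row)
    if sigma == 0:
        return []  # every cell matches its expectation exactly
    return [
        (i, j, table[i][j], expected[i][j], residuals[i][j] / sigma)
        for i in range(len(table))
        for j in range(len(table[0]))
        if abs(residuals[i][j] / sigma) > threshold
    ]


# Hypothetical data: rows are business units, columns are periods.
table = [
    [50.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, 10.0, 10.0],
]
exceptions = find_exceptions(table)
# Only cell (0, 0) exceeds the threshold for this table.
print(exceptions)
```

In this sketch the expected behaviour comes from a simple statistical model. An analyst's own reference values, such as a budget or last year's figures, could replace `expected_additive` without changing the detection step.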