A Pareto Model for OLAP View Size Estimation
On-Line Analytical Processing (OLAP) aims at gaining useful information quickly from large amounts of data residing in a data warehouse. To improve the quickness of response to queries, pre-aggregation is a useful strategy. However, it is usually impossible to pre-aggregate along all combinations of the dimensions. The multi-dimensional aspects of the data lead to combinatorial explosion in the number and potential storage size of the aggregates. We must selectively pre-aggregate. Cost/benefit analysis involves estimating the storage requirements of the aggregates in question. We present an original algorithm for estimating the number of rows in an aggregate based on the Pareto distribution model. We test the Pareto Model Algorithm empirically against four published algorithms, and conclude the Pareto Model Algorithm is consistently the best of these algorithms for estimating view size.
Year of publication: | 2003-04 |
---|---|
Authors: | Nadeau, Thomas P. |
Publisher: | Kluwer Academic Publishers; Springer Science+Business Media |
Subject: | Economics / Management Science | Management of Computing and Information Systems | Systems Theory | Control | Operation Research/Decision Theory | Business Information Systems | Pareto distribution | OLAP | view size estimation | materialized view selection | Mathematics | Management | Industrial and Operations Engineering | Science | Business and Economics | Engineering |
Similar items by subject
- Stochastic scheduling of parallel queues with set-up costs
  Duenyas, Izak, (1995)
- On the Introduction of an Agile, Temporary Workforce into a Tandem Queueing System
  Kaufman, David L., (2005)
- Base-stock control for single-product tandem make-to-stock systems
  Duenyas, Izak, (1997)