Page 33 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 2: Data Mining Concept

Figure 2.3: Typical Framework of a Data Warehouse for a Manufacturing Company

A data warehouse is usually modelled as a multidimensional database structure, where each
dimension corresponds to an attribute or a set of attributes in the schema, and each cell stores the
value of some aggregate measure, such as count or sales amount. The actual physical structure
of a data warehouse may be a relational data store or a multidimensional data cube. A data cube
provides a multidimensional view of data and allows summarised data to be precomputed and
accessed quickly.
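
As a rough illustration of the cell model described above, the following Python sketch builds
cube cells from relational-style fact rows. The dimension names (item, city, quarter), the
sample rows, and the function build_cube are assumptions made for this example only; they do
not come from Figure 2.3.

```python
from collections import defaultdict

# Hypothetical fact rows, as they might sit in a relational store:
# (item, city, quarter, sales_amount)
FACTS = [
    ("bolt", "Delhi", "Q1", 120.0),
    ("bolt", "Delhi", "Q2", 80.0),
    ("bolt", "Mumbai", "Q1", 50.0),
    ("gear", "Delhi", "Q1", 200.0),
    ("gear", "Delhi", "Q1", 100.0),
]


def build_cube(facts):
    """Map each (item, city, quarter) coordinate to its aggregate measures."""
    cells = defaultdict(lambda: {"count": 0, "sales": 0.0})
    for item, city, quarter, amount in facts:
        cell = cells[(item, city, quarter)]
        cell["count"] += 1       # aggregate measure: number of fact rows
        cell["sales"] += amount  # aggregate measure: total sales amount
    return dict(cells)


cube = build_cube(FACTS)
print(cube[("gear", "Delhi", "Q1")])
```

Each key is one cell coordinate (one value per dimension), and each value holds the measures
stored in that cell, so a summary lookup becomes a single dictionary access.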

2.6.4 Data cube

The data cube has several alternative names or variants, such as “multidimensional
databases,” “materialised views,” and “OLAP (On-Line Analytical Processing).” The general
idea of the approach is to materialise certain expensive computations that are frequently
requested, especially those involving aggregate functions such as count, sum, average, and max,
and to store the materialised views in a multidimensional database (called a “data cube”) for
decision support, knowledge discovery, and many other applications. Aggregate functions can
be precomputed for groupings over different sets or subsets of attributes. The values of
each attribute may also be grouped into a hierarchy or a lattice structure.
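
The precomputation over subsets of attributes can be sketched in Python as follows. This is a
minimal illustration, not a production design: the rows, the dimension names, and the function
materialise are hypothetical, and only the sum aggregate is shown.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("item", "city", "quarter")

# Hypothetical base data; "sales" is the measure being aggregated.
ROWS = [
    {"item": "bolt", "city": "Delhi", "quarter": "Q1", "sales": 120},
    {"item": "bolt", "city": "Mumbai", "quarter": "Q1", "sales": 50},
    {"item": "gear", "city": "Delhi", "quarter": "Q2", "sales": 200},
]


def materialise(rows, dimensions, measure):
    """Precompute sum(measure) for every subset of the dimensions."""
    views = {}
    for k in range(len(dimensions) + 1):
        for group in combinations(dimensions, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            views[group] = dict(totals)  # one materialised view per grouping
    return views


views = materialise(ROWS, DIMENSIONS, "sales")
print(len(views))          # number of groupings: 2 ** 3
print(views[("city",)])    # totals grouped by city only
```

With three dimensions there are 2^3 = 8 groupings, from the empty grouping (the grand total) up
to the full (item, city, quarter) grouping. The number of views doubles with each added
dimension, which is why the text speaks of materialising only certain expensive computations.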

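Example: “Date” can be grouped into “day”, “week”, “month”, “quarter”, or “year”. Because a
week can span two months, weeks do not nest inside months, so these groupings form a lattice
structure rather than a single linear hierarchy.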
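
The date groupings in the example can be checked with a short Python sketch. It assumes ISO
week numbering (the text does not say how weeks are defined), and the function date_levels is
a hypothetical helper for this illustration.

```python
from datetime import date


def date_levels(d):
    """Return the coarser groupings that a single day belongs to."""
    iso_year, iso_week, _ = d.isocalendar()  # ISO week numbering is an assumption
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": str(d.year),
    }


a = date_levels(date(2024, 1, 31))
b = date_levels(date(2024, 2, 1))

# Two consecutive days share a week but fall in different months,
# so "week" cannot be placed below "month" in a single chain.
print(a["week"], b["week"])
print(a["month"], b["month"])
```

Day rolls up to month, quarter, and year along one path and to week along another, which is
the sense in which the groupings form a lattice.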

Generalisation and specialisation can be performed on a multidimensional data cube by
“roll-up” and “drill-down” operations. A roll-up operation reduces the number of dimensions
in a data cube or generalises attribute values to higher-level concepts, whereas a drill-down
operation does the reverse. Since many aggregate functions often need to be computed
repeatedly in data analysis, storing precomputed results in a multidimensional data cube
can ensure fast response times and flexible views of the data from different angles and at
different abstraction levels.
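
Both kinds of roll-up, and the matching drill-down, can be sketched in Python as below. The
cell values, the (city, month) layout, and all function names are assumptions made for this
illustration.

```python
from collections import defaultdict

# Hypothetical base cells at the finest level: (city, month) -> sales
BASE = {
    ("Delhi", "2024-01"): 100,
    ("Delhi", "2024-02"): 150,
    ("Delhi", "2024-04"): 70,
    ("Mumbai", "2024-01"): 40,
}


def month_to_quarter(month):
    """Generalise a 'YYYY-MM' value to its 'YYYY-Qn' concept."""
    year, mm = month.split("-")
    return f"{year}-Q{(int(mm) - 1) // 3 + 1}"


def roll_up_time(cells):
    """Roll-up by generalising month to quarter."""
    out = defaultdict(int)
    for (city, month), sales in cells.items():
        out[(city, month_to_quarter(month))] += sales
    return dict(out)


def roll_up_drop_city(cells):
    """Roll-up by removing the city dimension."""
    out = defaultdict(int)
    for (_city, period), sales in cells.items():
        out[period] += sales
    return dict(out)


def drill_down(cells, city, quarter):
    """Return the month-level cells behind one (city, quarter) cell."""
    return {
        key: sales
        for key, sales in cells.items()
        if key[0] == city and month_to_quarter(key[1]) == quarter
    }


by_quarter = roll_up_time(BASE)
by_period = roll_up_drop_city(by_quarter)
detail = drill_down(BASE, "Delhi", "2024-Q1")
print(by_quarter)
print(by_period)
print(detail)
```

Note that drill_down reads from the finer-grained BASE cells: detail cannot be recovered from
a summary alone, so the lower level must be stored or recomputable for drill-down to work.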

Lovely Professional University 27
