Page 195 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 195
Unit 10: Multi-dimensional Data Models and Aggregation
Logical Dimensions: Dimensions contain a set of unique values that identify and categorise data. notes
They form the edges of a logical cube, and thus of the measures within the cube. Because measures
are typically multidimensional, a single value in a measure must be qualified by a member of
each dimension to be meaningful. For example, the Sales measure has four dimensions: Time,
Customer, Product, and Channel. A particular Sales value (43,613.50) only has meaning when it
is qualified by a specific time period (Feb-01), a customer (Warren Systems), a product (Portable
PCs), and a channel (Catalog).
Logical Hierarchies and Levels: A hierarchy is a way to organise data at different levels of
aggregation. In viewing data, analysts use dimension hierarchies to recognise trends at one level,
drill down to lower levels to identify reasons for these trends, and roll up to higher levels to see
what affect these trends have on a larger sector of the business.
Each level represents a position in the hierarchy. Each level above the base (or most detailed)
level contains aggregate values for the levels below it. The members at different levels have a
one-to-many parent-child relation. For example, Q1-2002 and Q2-2002 are the children of 2002,
thus 2002 is the parent of Q1-2002 and Q2-2002.
Suppose a data warehouse contains snapshots of data taken three times a day, that is, every
8 hours. Analysts might normally prefer to view the data that has been aggregated into days,
weeks, quarters, or years. Thus, the Time dimension needs a hierarchy with at least five levels.
Similarly, a sales manager with a particular target for the upcoming year might want to allocate
that target amount among the sales representatives in his territory; the allocation requires a
dimension hierarchy in which individual sales representatives are the child values of a particular
territory.
Hierarchies and levels have a many-to-many relationship. A hierarchy typically contains several
levels, and a single level can be included in more than one hierarchy
Logical Attributes: An attribute provides additional information about the data. Some attributes
are used for display. For example, you might have a product dimension that uses Stock Keeping
Units (SKUs) for dimension members. The SKUs are an excellent way of uniquely identifying
thousands of products, but are meaningless to most people if they are used to label the data in a
report or graph. You would define attributes for the descriptive labels.
Multi-dimensional Data storage in analytic Workspaces
In the logical multidimensional model, a cube represents all measures with the same shape, that
is, the exact same dimensions. In a cube shape, each edge represents a dimension. The dimension
members are aligned on the edges and divide the cube shape into cells in which data values are
stored.
In an analytic workspace, the cube shape also represents the physical storage of multidimensional
measures, in contrast with two-dimensional relational tables. An advantage of the cube shape is
that it can be rotated: there is no one right way to manipulate or view the data. This is an important
part of multidimensional data storage, calculation, and display, because different analysts need
to view the data in different ways. For example, if you are the Sales Manager, then you need to
look at the data differently from a product manager or a financial analyst.
Assume that a company collects data on sales. The company maintains records that quantify how
many of each product was sold in a particular sales region during a specific time period. You can
visualise the sales measure as the cube shown in Figure 10.2.
LoveLy professionaL university 189