Page 33 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 2: Data Mining Concept




                  Figure 2.3: Typical Framework of a Data Warehouse for a Manufacturing Company
























          A data warehouse is usually modelled by a multidimensional database structure, where each
          dimension corresponds to an attribute or a set of attributes in the schema, and each cell stores the
          value of some aggregate measure, such as count or sales amount. The actual physical structure
          of a data warehouse may be a relational data store or a multidimensional data cube. A data cube
          provides a multidimensional view of data and allows the precomputation of, and fast access to,
          summarised data.
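
          The cell-based model described above can be sketched in a few lines of Python. This is
          only an illustration under stated assumptions: the dimensions (product, region, quarter),
          the sample rows, and the function name build_cube are hypothetical and do not come from
          the figure. Each cell is identified by one value per dimension and stores two aggregate
          measures, a count and a sales amount.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales_amount).
facts = [
    ("bolt", "north", "Q1", 100.0),
    ("bolt", "north", "Q1", 50.0),
    ("bolt", "south", "Q1", 70.0),
    ("nut", "north", "Q2", 30.0),
]


def build_cube(rows):
    """Map each (product, region, quarter) cell to its aggregate measures."""
    cube = defaultdict(lambda: {"count": 0, "sales": 0.0})
    for product, region, quarter, amount in rows:
        cell = cube[(product, region, quarter)]
        cell["count"] += 1       # aggregate measure: number of transactions
        cell["sales"] += amount  # aggregate measure: total sales amount
    return dict(cube)


cube = build_cube(facts)
print(cube[("bolt", "north", "Q1")])
```

          Looking up a cell is then a single dictionary access on the tuple of dimension values,
          which is the kind of fast access to summarised data that precomputation is meant to give.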

          2.6.4 Data cube

          The data cube goes by several alternative names or variants, such as “multidimensional
          databases,” “materialised views,” and “OLAP (On-Line Analytical Processing).” The general
          idea of the approach is to materialise certain expensive computations that are frequently
          requested, especially those involving aggregate functions such as count, sum, average, and max,
          and to store such materialised views in a multidimensional database (called a “data cube”) for
          decision support, knowledge discovery, and many other applications. Aggregate functions can
          be precomputed for groupings over different sets or subsets of attributes. The values of
          each attribute may also be grouped into a hierarchy or a lattice structure.


                 Example: “Date” can be grouped into “day”, “month”, “quarter”, “year” or “week”,
          which forms a lattice structure.
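
          A short sketch may help show why these levels form a lattice and not a single chain.
          The function name date_levels is hypothetical, and the use of ISO 8601 week numbering is
          an assumption, since the text does not say how weeks are defined. Days roll up to months,
          quarters and years along one path, and to weeks along another; a week can straddle two
          months or two years, so "week" does not sit below "month" in the hierarchy.

```python
from datetime import date


def date_levels(d):
    """Return the value of a date at each abstraction level."""
    iso_year, iso_week, _ = d.isocalendar()  # assumption: ISO 8601 weeks
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": str(d.year),
    }


levels = date_levels(date(2024, 12, 30))
print(levels)
```

          For 30 December 2024 the month, quarter and year levels all fall in 2024, while the ISO
          week is the first week of 2025, which shows the two roll-up paths diverging.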
          Generalisation and specialisation can be performed on a multidimensional data cube by
          “roll-up” or “drill-down” operations: a roll-up operation reduces the number of dimensions
          in a data cube or generalises attribute values to higher-level concepts, whereas a drill-down
          operation does the reverse. Since many aggregate functions often need to be computed
          repeatedly in data analysis, storing precomputed results in a multidimensional data
          cube can ensure fast response times and flexible views of the data from different angles and at
          different abstraction levels.
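
          The following is a minimal sketch of precomputing aggregates for every subset of
          dimensions, so that roll-up and drill-down become lookups instead of fresh computations.
          The dimension names, the sample rows and the function name precompute_cuboids are
          hypothetical, and materialising every subset is a simplification: a real system would
          typically materialise only selected views.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "region", "quarter")

# Hypothetical fact rows; "sales" is the measure being aggregated.
facts = [
    {"product": "bolt", "region": "north", "quarter": "Q1", "sales": 100.0},
    {"product": "bolt", "region": "south", "quarter": "Q1", "sales": 70.0},
    {"product": "nut", "region": "north", "quarter": "Q2", "sales": 30.0},
]


def precompute_cuboids(rows, dimensions):
    """Sum sales for every subset of dimensions (one table per subset)."""
    cuboids = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row["sales"]
            cuboids[dims] = dict(totals)
    return cuboids


cuboids = precompute_cuboids(facts, DIMENSIONS)

# Roll-up: drop "region" and "quarter", keeping only "product".
by_product = cuboids[("product",)]
# Drill-down: move back to a finer view that restores "region".
by_product_region = cuboids[("product", "region")]

print(by_product)
print(by_product_region)
```

          With three dimensions there are eight such tables, from the full-detail view down to the
          grand total stored under the empty tuple of dimensions. The number of tables doubles with
          each added dimension, which is the storage cost paid for the faster response time.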














                                           Lovely Professional University                                    27