Page 199 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 199

Unit 10: Multi-dimensional Data Models and Aggregation




          The major difference between the snowflake and star schema models is that the dimension tables   notes
          of the snowflake model may be kept in normalised form. Such a table is easy to maintain and also
          saves storage space because a large dimension table can be extremely large when the dimensional
          structure is included as columns. Since much of this space is redundant data, creating a normalised
          structure will reduce the overall space requirement. However, the snowflake structure can reduce
          the effectiveness of browsing since more joins will be needed to execute a query. Consequently,
          the system performance may be adversely impacted. Performance benchmarking can be used to
          determine what is best for your design.


                 Example:  In  Snowflake  schema,  the  example  diagram  shown  in  Figure  10.6  has  4
          dimension tables, 4 lookup tables and 1 fact table. The reason is that hierarchies (category, branch,
          state, and month) are being broken out of the dimension tables (PRODUCT, ORGANISATION,
          LOCATION, and TIME) respectively and shown separately. In OLAP, this Snowflake schema
          approach increases the number of joins and poor performance in retrieval of data.

                                 Figure 10.6: Example of Snowflake Schema










































          A compromise between the star schema and the snowflake schema is to adopt a mixed schema
          where only the very large dimension tables are normalised. Normalising large dimension tables
          saves storage space, while keeping small dimension tables unnormalised may reduce the cost and
          performance degradation due to joins on multiple dimension tables. Doing both may lead to an
          overall performance gain. However, careful performance tuning could be required to determine
          which dimension tables should be normalised and split into multiple tables.







                                           LoveLy professionaL university                                   193
   194   195   196   197   198   199   200   201   202   203   204