Page 274 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 274

Data Warehousing and Data Mining




                    notes          14.4 optimizing and Materialization of DW views

                                   Among the techniques adopted in relational implementations of data warehouses to improve
                                   query performance, view materialization and indexing are presumably the most effective ones.
                                   Materialized  views  are  physical  structures  that  improve  data  access  time  by  pre-computing
                                   intermediary results. Then, user queries can be efficiently processed by using data stored within
                                   views and do not need to access the original data. Nevertheless, the use of materialized views
                                   requires additional storage space and entails maintenance overhead when refreshing the data
                                   warehouse.
                                   One of the most important issues in data warehouse physical design is to select an appropriate
                                   set of materialized views, called a configuration of views, which minimizes total query response
                                   time and the cost of maintaining the selected views, given a limited storage space. To achieve this
                                   goal, views that are closely related to the workload queries must be materialized.

                                   Materialized views are schema objects that can be used to summarize, precompute, replicate, and
                                   distribute data, e.g., to construct a data warehouse.
                                   A materialized view provides indirect access to table data by storing the results of a query in a
                                   separate schema object. Unlike an ordinary view, which does not take up any storage space or
                                   contain any data.

                                   The  motivation  for  using  materialized  views  is  to  improve  performance,  but  the  overhead
                                   associated with materialized view management can become a significant system management
                                   problem. When reviewing or evaluating some of the necessary materialized view management
                                   activities, consider some of the following:
                                   1.   Identifying what materialized views to create initially
                                   2.   Indexing the materialized views
                                   3.   Ensuring that all materialized views and materialized view indexes are refreshed properly
                                       each time the database is updated
                                   4.   Checking which materialized views have been used
                                   5.   Determining how effective each materialized view has been on workload performance

                                   6.   Measuring the space being used by materialized views
                                   7.   Determining which new materialized views should be created
                                   8.   Determining which existing materialized views should be dropped
                                   9.   Archiving old detail and materialized view data that is no longer useful

                                   After the initial effort of creating and populating the data warehouse or data mart, the major
                                   administration overhead is the update process, which involves:
                                   1.   Periodic extraction of incremental changes from the operational systems
                                   2.   Transforming the data

                                   3.   Verifying that the incremental changes are correct, consistent, and complete
                                   4.   Bulk-loading the data into the warehouse
                                   5.   Refreshing indexes and materialized views so that they are consistent with the detail data










          268                              LoveLy professionaL university
   269   270   271   272   273   274   275   276   277   278   279