Page 171 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 171

Unit 9: Data Warehouse Refreshment – II




          9.1 update propagation into Materialized views                                        notes

          Typically, data flows from one or more online transaction processing (OLTP) database into a data
          warehouse on a monthly, weekly, or daily basis. The data is normally processed in a staging file
          before being added to the data warehouse. Data warehouses commonly range in size from tens
          of gigabytes to a few terabytes. Usually, the vast majority of the data is stored in a few very large
          fact tables.
          One  technique  employed  in  data  warehouses  to  improve  performance  is  the  creation  of
          summaries. Summaries are special types of aggregate views that improve query execution times
          by pre-calculating expensive joins and aggregation operations prior to execution and storing the
          results in a table in the database.

                 Example: You can create a table to contain the sums of sales by region and by product.

          The summaries or aggregates that are referred to in this book and in literature on data warehousing
          are created in Oracle Database using a schema object called a materialized view. Materialized views
          can perform a number of roles, such as improving query performance or providing replicated
          data.
          In the past, organizations using summaries spent a significant amount of time and effort creating
          summaries manually, identifying which summaries to create, indexing the summaries, updating
          them, and advising their users on which ones to use. The introduction of summary management
          eased the workload of the database administrator and meant the user no longer needed to be
          aware of the summaries that had been defined. The database administrator creates one or more
          materialized views, which are the equivalent of a summary. The end user queries the tables and
          views at the detail data level. The query rewrite mechanism in the Oracle server automatically
          rewrites the SQL query to use the summary tables. This mechanism reduces response time for
          returning results from the query. Materialized views within the data warehouse are transparent
          to the end user or to the database application.

          Although materialized views are usually accessed through the query rewrite mechanism, an end
          user or database application can construct queries that directly access the materialized views.
          However, serious consideration should be given to whether users should be allowed to do this
          because any change to the materialized views will affect the queries that reference them.

          Materialized views for Data Warehouses

          In data warehouses, you can use materialized views to pre-compute and store aggregated data
          such  as  the  sum  of  sales.  Materialized  views  in  these  environments  are  often  referred  to  as
          summaries, because they store summarized data. They can also be used to pre-compute joins
          with  or  without  aggregations.  A  materialized  view  eliminates  the  overhead  associated  with
          expensive joins and aggregations for a large or important class of queries.

          Materialized views for Distributed computing

          In distributed environments, you can use materialized views to replicate data at distributed sites
          and to synchronize updates done at those sites with conflict resolution methods. The materialized
          views as replicas provide local access to data that otherwise would have to be accessed from
          remote sites. Materialized views are also useful in remote data marts.

          Materialized views for Mobile computing

          You can also use materialized views to download a subset of data from central servers to mobile
          clients, with periodic refreshes and updates between clients and the central servers.




                                           LoveLy professionaL university                                   165
   166   167   168   169   170   171   172   173   174   175   176