
Unit 8: Data Warehouse Refreshment
          Gurwinder Kaur, Lovely Professional University







             Contents
             Objectives
             Introduction
             8.1   Data Warehouse Refreshment
             8.2   Incremental Data Extraction
             8.3   Data Cleaning
                 8.3.1   Data Cleaning for Missing Values
                 8.3.2   Noisy Data
             8.4   Summary
             8.5   Keywords
             8.6   Self Assessment
             8.7   Review Questions
             8.8   Further Readings



          Objectives

          After studying this unit, you will be able to:

          •  Understand data warehouse refreshment
          •  Explain incremental data extraction
          •  Describe data cleaning

          Introduction


          A distinguishing characteristic of data warehouses is the temporal character of warehouse data,
          i.e., the management of histories over an extended period of time. Historical data is necessary for
          business trend analysis, which can be expressed as analysing the temporal development of
          real-time data. For the refreshment process, maintaining histories in the data warehouse (DWH)
          means that either periodic snapshots of the corresponding operational data or the relevant
          operational updates are propagated and stored in the warehouse, without overwriting previous
          warehouse states.
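
          The idea of keeping a history without overwriting earlier states can be illustrated with a
          small sketch. It is only an illustration under simple assumptions: the warehouse history is
          modelled as a plain Python list, and the function name `append_snapshot` and the customer
          records are hypothetical, not taken from any particular warehouse product.

```python
from datetime import date


def append_snapshot(history, operational_rows, snapshot_date):
    """Append a dated copy of each operational row to the history.

    Earlier entries in `history` are never modified or removed,
    so every previous warehouse state stays available.
    """
    for row in operational_rows:
        history.append({**row, "snapshot_date": snapshot_date})
    return history


# Hypothetical operational state of one customer on two consecutive days.
history = []
append_snapshot(history, [{"customer_id": 1, "city": "Delhi"}], date(2024, 1, 1))
append_snapshot(history, [{"customer_id": 1, "city": "Mumbai"}], date(2024, 1, 2))

# Both states are retained, so the change over time can be analysed.
cities_over_time = [(r["snapshot_date"].isoformat(), r["city"]) for r in history]
```

          In the operational system the customer's city would simply be updated in place; in the
          sketch both values survive, each tagged with the date of its snapshot.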
          Extraction is the operation of retrieving data from a source system for further use in a data
          warehouse environment. It is the first step of the ETL (extract, transform, load) process. After
          extraction, the data can be transformed and loaded into the data warehouse.
          The source systems for a data warehouse are typically transaction processing applications. For
          example, one of the source systems for a sales analysis data warehouse might be an order entry
          system that records all of the current order activities.
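
          The three ETL steps can be sketched for the order entry example as follows. This is a minimal
          illustration, not a definitive implementation: the source system and the warehouse are
          modelled as Python lists, and the field names (`order_id`, `product`, `quantity`,
          `unit_price`) and the chosen transformation are assumptions made for the example.

```python
def extract(source_rows):
    """Extract: copy rows out of the source system without changing it."""
    return [dict(row) for row in source_rows]


def transform(rows):
    """Transform: tidy the product name and compute the order total."""
    return [
        {
            "order_id": r["order_id"],
            "product": r["product"].strip().upper(),
            "total": r["quantity"] * r["unit_price"],
        }
        for r in rows
    ]


def load(warehouse, rows):
    """Load: append the transformed rows to the warehouse."""
    warehouse.extend(rows)
    return warehouse


# Hypothetical order recorded by the order entry (source) system.
source = [{"order_id": 101, "product": " widget ", "quantity": 3, "unit_price": 20}]

warehouse = load([], transform(extract(source)))
```

          Keeping extraction separate from transformation means the source system is only read, never
          modified, which matters because it is usually a live transaction processing application.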










