Page 183 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 9: Data Warehouse Refreshment – II




          describe the refreshment activities and their organization as a workflow. Then we give examples
          of different workflow scenarios to show how refreshment may be a dynamic and evolving
          process. Finally, we summarize the different perspectives through which a given refreshment
          scenario should be considered.
          The refreshment process is similar to the loading process in its data flow. However, while the
          loading process is a massive feeding of the data warehouse, the refreshment process captures the
          differential changes held in the sources and propagates them through the hierarchy of data stores
          in the data warehouse. The preparation step extracts from each source the data that characterises
          the changes that have occurred in that source since the last extraction. As with loading, this data
          is cleaned and possibly archived before its integration. The integration step reconciles the
          changes coming from multiple sources and adds them to the ODS. The aggregation
          step incrementally recomputes the hierarchy of aggregated views using these changes. The
          customisation step propagates the summarized data to the data marts. As with the loading
          phase, this is a logical decomposition; its operational implementation varies widely across
          data warehouse products. This logical view nevertheless provides a degree of traceability for the
          refreshment process. Figure 9.4 shows the activities of the refreshment process as well as a sample
          of the coordinating events.
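
          The steps described above can be illustrated with a minimal Python sketch. It is not taken
          from any data warehouse product: the Change record, the function names, and the sample data
          are all hypothetical, the ODS is modelled as a plain dictionary, and the only aggregated view
          is a total per region. It is meant only to show how differential changes, rather than full
          data sets, flow from the sources to a data mart.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One differential change captured from a source (hypothetical shape)."""
    source: str
    key: str      # e.g. a product identifier
    region: str
    delta: float  # signed change in the measured amount


def extract(source_log, last_position):
    """Preparation: return the changes logged since the last extraction."""
    return source_log[last_position:], len(source_log)


def clean(changes):
    """Cleaning: drop changes with an empty key and normalise text fields."""
    return [
        Change(c.source, c.key.strip().upper(), c.region.strip().lower(), c.delta)
        for c in changes
        if c.key.strip()
    ]


def integrate(ods, changes_by_source):
    """Integration: reconcile changes from all sources and apply them to the ODS."""
    net = defaultdict(float)
    for changes in changes_by_source:
        for c in changes:
            net[(c.key, c.region)] += c.delta
    for ods_key, delta in net.items():
        ods[ods_key] = ods.get(ods_key, 0.0) + delta
    return dict(net)


def aggregate(view, net_changes):
    """Aggregation: update per-region totals from the changes only."""
    for (_key, region), delta in net_changes.items():
        view[region] = view.get(region, 0.0) + delta
    return view


def customise(view, regions):
    """Customisation: push the summarised rows of interest to a data mart."""
    return {r: view[r] for r in regions if r in view}


# Warehouse state after the initial load (assumed sample data).
ods = {("P1", "north"): 100.0}
regional_totals = {"north": 100.0}

# Changes logged by two sources since the last extraction.
log_a = [Change("A", "p1 ", "North", 5.0), Change("A", "", "north", 9.0)]
log_b = [Change("B", "P2", "south", 7.0), Change("B", "P1", "NORTH", -2.0)]

batch_a, position_a = extract(log_a, 0)
batch_b, position_b = extract(log_b, 0)
net = integrate(ods, [clean(batch_a), clean(batch_b)])
aggregate(regional_totals, net)
north_mart = customise(regional_totals, ["north"])
```

          In this sketch the aggregated view is never recomputed from the whole ODS; it is adjusted
          using only the reconciled net changes, which is the sense in which the aggregation step is
          incremental.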

                          Figure 9.4: The Generic Workflow for the Refreshment Process


































          In workflow systems, activities are coordinated by control flows, which may be notifications of
          process commitment, emails issued by agents, temporal events, or any other trigger events. In the
          refreshment process, coordination is done through a wide range of event types.
          Several event types may trigger and synchronize the refreshment activities: temporal events,
          termination events, or any other user-defined event.
          Depending on the refreshment scenario, one can choose an appropriate set of event types that
          makes it possible to achieve the correct level of synchronization.
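
          As a rough illustration of event-based coordination, the following Python sketch wires
          refreshment activities to the three kinds of events mentioned above. The Coordinator class,
          the event names (nightly_timer, extraction_done, analyst_request, and so on), and the
          activity functions are assumptions made for this example and do not correspond to any
          particular workflow system.

```python
from collections import defaultdict, deque


class Coordinator:
    """Minimal event dispatcher: activities subscribe to event types."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.trace = []  # names of activities in the order they ran

    def on(self, event_type, activity):
        """Register an activity to be triggered by an event type."""
        self._subscribers[event_type].append(activity)

    def emit(self, event_type):
        """Dispatch an event, then any termination events it gives rise to."""
        queue = deque([event_type])
        while queue:
            event = queue.popleft()
            for activity in self._subscribers.get(event, []):
                self.trace.append(activity.__name__)
                follow_up = activity()  # an activity may return a termination event
                if follow_up is not None:
                    queue.append(follow_up)


def extraction():
    return "extraction_done"


def cleaning():
    return "cleaning_done"


def integration():
    return "integration_done"


def aggregation():
    return None  # end of the chain in this sketch


coordinator = Coordinator()
coordinator.on("nightly_timer", extraction)      # temporal event
coordinator.on("extraction_done", cleaning)      # termination events
coordinator.on("cleaning_done", integration)
coordinator.on("integration_done", aggregation)
coordinator.on("analyst_request", aggregation)   # user-defined event

coordinator.emit("nightly_timer")
coordinator.emit("analyst_request")
```

          Choosing which event types an activity subscribes to is what sets the level of
          synchronization: here aggregation runs both at the end of the nightly chain and on demand.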
          Activities of the refreshment workflow are not necessarily executed as soon as they are triggered;
          they may also depend on the current state of the input data stores. For example, if the extraction
          is triggered periodically, it is actually executed only when there are actual changes in the source log file. If



                                           Lovely Professional University                                   177