Data Warehousing and Data Mining
Figure 9.6: Second Example of Refreshment Scenario
When the refreshment activities are long-running, or when the data warehouse administrator (DWA)
wants to apply validation procedures between activities, temporal events or activity terminations
can be used to synchronize the whole refreshment process. In general, the quality requirements may
impose a certain synchronization strategy. For example, if users require highly fresh data, each
update in a source should be mirrored in the views as soon as possible. This in turn determines
the synchronization strategy: trigger the extraction after each change in a source; trigger the
integration, when semantically relevant, after the commit of each data source; propagate changes
through the views immediately after integration; and customize the user views in the data marts.
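The high-freshness strategy described above can be illustrated with a small event-driven sketch. This is only a toy model under stated assumptions, not an implementation from any particular warehouse product: the class `RefreshmentScheduler`, the event names (`source_changed`, `source_committed`, `integration_done`, `views_updated`), and the `is_relevant` predicate standing in for "semantically relevant" are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class RefreshmentScheduler:
    """Toy event bus: each event name maps to the activities it triggers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)
        self.log: List[str] = []  # records the order in which activities ran

    def on(self, event: str, handler: Callable[[str], None]) -> None:
        """Register an activity to run whenever `event` is emitted."""
        self._handlers[event].append(handler)

    def emit(self, event: str, source: str) -> None:
        """Run every activity registered for `event`, in registration order."""
        for handler in list(self._handlers[event]):
            handler(source)


def build_high_freshness_strategy(
    is_relevant: Callable[[str], bool],
) -> RefreshmentScheduler:
    """Wire the four activities so each one triggers the next immediately."""
    s = RefreshmentScheduler()

    def extract(source: str) -> None:
        # Triggered after each change in a source.
        s.log.append(f"extract:{source}")

    def integrate(source: str) -> None:
        # Triggered after the commit of a source, only when relevant.
        if not is_relevant(source):
            return
        s.log.append(f"integrate:{source}")
        s.emit("integration_done", source)

    def propagate(source: str) -> None:
        # Propagate changes through the views right after integration.
        s.log.append(f"propagate:{source}")
        s.emit("views_updated", source)

    def customize(source: str) -> None:
        # Customize the user views in the data marts.
        s.log.append(f"customize:{source}")

    s.on("source_changed", extract)
    s.on("source_committed", integrate)
    s.on("integration_done", propagate)
    s.on("views_updated", customize)
    return s


# Hypothetical sources: "orders" is treated as relevant, "audit_log" is not.
scheduler = build_high_freshness_strategy(lambda source: source != "audit_log")
scheduler.emit("source_changed", "orders")
scheduler.emit("source_committed", "orders")
scheduler.emit("source_committed", "audit_log")  # skipped: not relevant
print(scheduler.log)
```

A less demanding freshness requirement could be modeled in the same sketch by replacing the immediate triggers with temporal events, for example by emitting `source_committed` from a periodic job instead of on every commit.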
Refreshment Scheduling
The refreshment process can be viewed from different perspectives:
1. Client-driven refreshment, which describes the part of the process that is triggered on demand
by the users. This part mainly concerns update propagation from the ODS to the aggregated
views. The on-demand strategy can be defined for all aggregated views or only for those
for which the freshness of data is related to the date of querying.
2. Source-driven refreshment, which defines the part of the process that is triggered by changes
made in the sources. This part concerns the preparation phase. Because the sources are
independent of one another, a different preparation strategy can be defined for each
source. Some sources may be associated with cleaning procedures, others not. Some
sources need a history of the extracted data, others do not. For some sources, the cleaning is
done on the fly during the extraction; for others it is done after the extraction or on the history