Page 240 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
Data Warehousing and Data Mining
Figure 12.4: A Process Model for Data Warehouses
As an example of a data warehouse process, we have partially modeled the data warehouse
loading process in Figure 12.5. The loading process is composed of several steps, one of which,
in our example, is data cleaning. The data cleaning step works on a data store, where
the data to be cleaned reside, and it is executed by a data cleaning agent. Among other things,
it affects the quality factors accuracy and availability: accuracy is expected to improve, while
availability decreases because of the locks that read-write operations place on the data
store. The data cleaning process may also store some results of its execution in the metadata
repository, for example a boolean value indicating whether the process completed successfully
and the number of tuples changed in the data store.
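The interaction between a cleaning step, its data store, and the metadata repository can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `MetadataRepository`, `clean_data_store`, the whitespace-trimming rule, and the record layout do not come from the process model itself, which only says that a success flag and a changed-tuple count may be stored.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRepository:
    """Hypothetical in-memory stand-in for the metadata repository."""
    records: list = field(default_factory=list)

    def store(self, process, succeeded, changed_tuples):
        # One record per execution of a process step.
        self.records.append(
            {"process": process, "succeeded": succeeded, "changed_tuples": changed_tuples}
        )


def clean_data_store(data_store, repository, process="data_cleaning"):
    """Strip surrounding whitespace from string fields, in place.

    The cleaning rule is only an example. The outcome (success flag and
    number of changed tuples) is recorded in the repository.
    """
    changed_tuples = 0
    succeeded = True
    try:
        for row in data_store:
            row_changed = False
            for key, value in list(row.items()):
                if isinstance(value, str) and value != value.strip():
                    row[key] = value.strip()
                    row_changed = True
            if row_changed:
                changed_tuples += 1
    except (AttributeError, TypeError):
        # A malformed row (not a dict) marks the run as unsuccessful.
        succeeded = False
    repository.store(process, succeeded, changed_tuples)
    return succeeded
```

Calling `clean_data_store(rows, repo)` on a list of dictionaries leaves one new record in `repo.records`, which later analyses of the warehouse can read.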
Figure 12.5: An Example for a Data Warehouse Process Pattern
The information stored in the repository can be used to find deficiencies in the data warehouse.
To show the usefulness of this information, consider the following query. It returns all data
cleaning processes that have decreased the availability of a data store according to the stored
measurements. The query is significant because it can reveal that the implementation of a data
cleaning process has become inefficient.
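The intent of such a query can first be sketched in plain Python, purely as an illustration; it is not the repository's query language. The `CleaningRun` record, its before/after availability fields, and the choice to pass the data store as an argument are all assumptions, since the text does not specify how measurements are stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleaningRun:
    """Hypothetical measurement record for one cleaning process execution."""
    process: str
    data_store: str
    availability_before: float  # assumed: fraction of time the store was accessible
    availability_after: float


def decreased_availability(runs, data_store):
    """Return the cleaning processes whose measured availability of data_store dropped."""
    return sorted(
        {
            run.process
            for run in runs
            if run.data_store == data_store
            and run.availability_after < run.availability_before
        }
    )
```

A process that appears in the result for many data stores, or repeatedly over time, is a candidate for closer inspection. The repository query itself follows.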
GenericQueryClass DecreasedAvailability
    isA DWCleaningProcess with
    parameter
234 Lovely Professional University