Page 256 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
Data Warehousing and Data Mining
13.2 Quality Analysis in Data Staging
In the data warehousing process, the data staging area consists of two components: the data
staging application server and the data store archive (repository), which holds the results of
extraction, transformation and loading activity. The data staging application server temporarily
stores and transforms data extracted from OLTP data sources, while the archival repository stores
cleaned, transformed records and attributes for later loading into data marts and data warehouses.
13.2.1 The Data Staging Process
The data staging process imports data either as streams or files, transforms it, produces integrated,
cleaned data and stages it for loading into data warehouses, data marts, or Operational Data
Stores.
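The import, transform, clean and stage steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not a real staging tool: the record layout (customer_id, name, amount), the quality rules and the function names transform and stage are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StagedRecord:
    """Hypothetical layout of a cleaned record held in the staging area."""
    customer_id: int
    name: str
    amount: float


def transform(raw):
    """Convert one raw source row (a dict of strings) into a typed record.

    Returns None when the row fails a basic quality check.
    """
    try:
        customer_id = int(raw["customer_id"])
        amount = float(raw["amount"])
    except (KeyError, ValueError):
        return None  # missing or malformed field
    name = raw.get("name", "").strip().title()
    if not name or amount < 0:
        return None  # empty name or negative amount
    return StagedRecord(customer_id, name, amount)


def stage(raw_rows):
    """Split incoming rows into cleaned records and rejected rows."""
    staged, rejected = [], []
    for raw in raw_rows:
        record = transform(raw)
        if record is None:
            rejected.append(raw)
        else:
            staged.append(record)
    return staged, rejected


# Sample input standing in for rows imported from a stream or file.
raw_rows = [
    {"customer_id": "1", "name": "  alice smith ", "amount": "10.50"},
    {"customer_id": "x2", "name": "Bob", "amount": "3"},  # malformed id
    {"customer_id": "3", "name": "", "amount": "7"},      # empty name
]
staged, rejected = stage(raw_rows)
```

Keeping the rejected rows separate, rather than silently dropping them, makes it possible to report on data quality before anything is loaded into the warehouse.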
Kimball et al. distinguish two data staging scenarios:
In the first scenario, a data staging tool is available and the data is already in a database.
The data flow is set up so that the data comes out of the source system, moves through the
transformation engine, and goes into a staging database. The flow is illustrated in Figure 13.1.
Figure 13.1: First Data Staging Scenario
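The first scenario can be imitated with two SQLite databases, one playing the source system and one the staging database. This is a sketch under stated assumptions: the table names orders and stg_orders, the sample rows and the transform function (standing in for the transformation engine) are hypothetical, and a real staging tool would do far more.

```python
import sqlite3

# Hypothetical source (OLTP) database with a few sample rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " alice ", "10.50"), (2, "BOB", "3"), (3, "carol", "n/a")],
)

# Hypothetical staging database that receives the transformed rows.
staging = sqlite3.connect(":memory:")
staging.execute(
    "CREATE TABLE stg_orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)


def transform(row):
    """Stand-in for the transformation engine: trim, fix case, cast types."""
    order_id, customer, amount = row
    try:
        return order_id, customer.strip().title(), float(amount)
    except ValueError:
        return None  # the amount could not be parsed as a number


# Source system -> transformation engine -> staging database.
cleaned = []
query = "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
for row in source.execute(query):
    result = transform(row)
    if result is not None:
        cleaned.append(result)

with staging:  # commits on success, rolls back on error
    staging.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", cleaned)

staged_rows = staging.execute(
    "SELECT order_id, customer, amount FROM stg_orders ORDER BY order_id"
).fetchall()
```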
In the second scenario, the starting point is a mainframe legacy system. The required data is
extracted into a flat file, the file is moved to a staging server, its contents are transformed,
and the transformed data is loaded into the staging database. Figure 13.2 illustrates this scenario.
Figure 13.2: Second Data Staging Scenario
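The four steps of the second scenario (extract to a flat file, move the file, transform, load) can be sketched as follows. Everything specific here is an assumption for illustration: the pipe-delimited layout, the file name accounts.dat, the temporary directories standing in for the mainframe and the staging server, and the stg_accounts table.

```python
import csv
import shutil
import sqlite3
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
extract_dir = workdir / "mainframe_extract"  # stands in for the legacy system
staging_dir = workdir / "staging_server"     # stands in for the staging server
extract_dir.mkdir()
staging_dir.mkdir()

# Step 1: extract the required data into a flat file (hypothetical layout:
# account number | name | balance in cents).
flat_file = extract_dir / "accounts.dat"
flat_file.write_text(
    "0001|SMITH, JOHN |000012550\n"
    "0002|DOE, JANE   |000000999\n"
)

# Step 2: move the file to the staging server.
staged_file = Path(shutil.move(str(flat_file), str(staging_dir / flat_file.name)))


# Step 3: transform the file contents into typed, cleaned rows.
def transform(fields):
    account_no, name, balance_cents = fields
    return int(account_no), name.strip().title(), int(balance_cents) / 100


with staged_file.open(newline="") as handle:
    rows = [transform(fields) for fields in csv.reader(handle, delimiter="|")]

# Step 4: load the transformed data into the staging database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE stg_accounts (account_no INTEGER PRIMARY KEY, name TEXT, balance REAL)"
)
with db:  # commits on success, rolls back on error
    db.executemany("INSERT INTO stg_accounts VALUES (?, ?, ?)", rows)

loaded = db.execute(
    "SELECT account_no, name, balance FROM stg_accounts ORDER BY account_no"
).fetchall()

shutil.rmtree(workdir)  # remove the temporary directories
```

Unlike the first scenario, the transformation here works on a file rather than on a database query, so the parsing rules for the flat-file layout become part of the staging logic.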
Lovely Professional University