Page 84 - DCAP208_Management Support Systems
P. 84
Unit 6: Data Warehousing
Notes
Figure 6.1: Data Warehouse Framework and Views
The major components of a data warehousing process are:
Data sources (legacy systems, ERP, OTLP, web logs)
Data extraction
Data loading
Comprehensive database (EDW)
Metadata
Middleware tools (enable access to the DW)
ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data
out of the source systems and placing it into a data warehouse.
ETL Technology is an important component of the Data Warehousing Architecture. It is used to
copy data from Operational Applications to the Data Warehouse Staging Area, from the DW
Staging Area into the Data Warehouse and finally from the Data Warehouse into a set of conformed
Data Marts that are accessible by decision makers.
The ETL software extracts data, transforms values of inconsistent data, cleanses “bad” data,
filters data and loads data into a target database. The scheduling of ETL jobs is critical. Should
there be a failure in one ETL job, the remaining ETL jobs must respond appropriately.
ETL involves the following tasks:
extracting the data from source systems (SAP, ERP, other operational systems), data from
different source systems is converted into one consolidated data warehouse format which
is ready for transformation processing.
transforming the data may involve the following tasks:
applying business rules
Example: So-called derivations, such as calculating new measures and dimensions.
LOVELY PROFESSIONAL UNIVERSITY 77