Page 10 - DCAP606_BUSINESS_INTELLIGENCE
Unit 1: Introduction to Business Intelligence
Deriving new calculated values (sale price = price - discount).
Merging data from multiple sources.
Summarizing (aggregating) certain rows and columns.
Splitting a column into multiple columns.
Resolving discrepancies between similar data items.
Validating the data.
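The transformation steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed implementation: the sample rows, the field names (`customer`, `region`, `price`, `discount`), the validation rule, and the helper functions `transform` and `summarize` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical rows from two source systems (illustrative data only).
orders = [
    {"order_id": 1, "customer": "Asha Rao", "region": "north", "price": 100.0, "discount": 10.0},
    {"order_id": 2, "customer": "Ben Lee", "region": "North", "price": 250.0, "discount": 0.0},
    {"order_id": 3, "customer": "Chen Wu", "region": "SOUTH", "price": 80.0, "discount": 5.0},
]
web_orders = [
    {"order_id": 4, "customer": "Dia Sen", "region": "south", "price": 40.0, "discount": -5.0},
]


def transform(*sources):
    """Merge rows from several sources, then validate and reshape each row."""
    clean, rejected = [], []
    for row in (r for source in sources for r in source):  # merging data from multiple sources
        row = dict(row)  # work on a copy so the source rows stay untouched
        # Validating the data (assumed rule: 0 <= discount <= price).
        if row["price"] < 0 or not 0 <= row["discount"] <= row["price"]:
            rejected.append(row)
            continue
        # Deriving a new calculated value.
        row["sale_price"] = row["price"] - row["discount"]
        # Splitting one column into multiple columns.
        row["first_name"], _, row["last_name"] = row["customer"].partition(" ")
        # Resolving discrepancies between similar data items ("North" vs "north").
        row["region"] = row["region"].strip().lower()
        clean.append(row)
    return clean, rejected


def summarize(rows):
    """Summarizing (aggregating): total sale price per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["sale_price"]
    return dict(totals)


clean_rows, rejected_rows = transform(orders, web_orders)
sales_by_region = summarize(clean_rows)
print(sales_by_region)
print("rejected:", [r["order_id"] for r in rejected_rows])
```

In this sketch, the row with a negative discount fails validation and is set aside rather than loaded, while the differently cased region names are normalized before aggregation.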
Figure 1.3 shows a representation of the ETL model.
Figure 1.3: Representation of ETL Model
Source: http://3.bp.blogspot.com/_tutW43y628U/TL2I-JTIFAI/AAAAAAAAAEI/mir1v2EMiTg/s1600/ETL_Global.jpg
The ETL function consolidates multiple data sources into a well-structured database for use
in complex analysis. The ETL process is performed periodically, such as daily, weekly, or
monthly, depending on the needs of the enterprise. This method is called offline ETL because
the target database is not updated continuously; it is refreshed on a periodic batch basis.
Although offline ETL serves its purpose well, it has some drawbacks:
The data in the data warehouse is not fresh. It could be weeks old. It is useful for
strategic functions but is not well suited to tactical use.
The source database typically must be temporarily inactive during the extract process.
Otherwise, the target database is left in an inconsistent state after the load. As a
result, the applications must be shut down, often for hours.
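The staleness drawback can be illustrated with a minimal sketch of an offline (batch) ETL run. It uses Python's standard `sqlite3` module with two in-memory databases standing in for the source system and the data warehouse. The table names (`sales`, `fact_sales`), the columns, and the function `run_batch_etl` are assumptions made for the example, not part of any particular ETL product.

```python
import sqlite3

# Two in-memory databases stand in for the source system and the warehouse.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, price REAL, discount REAL)")
warehouse.execute("CREATE TABLE fact_sales (id INTEGER PRIMARY KEY, sale_price REAL)")
source.executemany("INSERT INTO sales VALUES (?, ?, ?)", [(1, 100.0, 10.0), (2, 250.0, 0.0)])
source.commit()


def run_batch_etl(source, warehouse):
    """One offline ETL run: extract everything, transform, reload the target."""
    # Extract: read all source rows at the time of the batch.
    rows = source.execute("SELECT id, price, discount FROM sales").fetchall()
    # Transform: derive the sale price.
    facts = [(row_id, price - discount) for row_id, price, discount in rows]
    # Load: replace the target table inside a single transaction.
    with warehouse:
        warehouse.execute("DELETE FROM fact_sales")
        warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?)", facts)
    return len(facts)


def warehouse_total(warehouse):
    """Total sale price as seen by a warehouse query."""
    return warehouse.execute("SELECT COALESCE(SUM(sale_price), 0) FROM fact_sales").fetchone()[0]


loaded = run_batch_etl(source, warehouse)
total_after_batch = warehouse_total(warehouse)

# A new sale arrives in the source after the batch has finished.
source.execute("INSERT INTO sales VALUES (3, 80.0, 5.0)")
source.commit()
total_before_next_batch = warehouse_total(warehouse)  # the warehouse has not seen the new sale

run_batch_etl(source, warehouse)
total_after_next_batch = warehouse_total(warehouse)  # now includes the new sale

print(total_after_batch, total_before_next_batch, total_after_next_batch)
```

Between the two runs, warehouse queries keep returning the same total even though the source has changed, which is the freshness limitation described above. The sketch does not model the source shutdown; it simply assumes that nothing writes to the source while the extract query runs.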
To support real-time business intelligence, the ETL function must be continuous and
non-invasive. This is called online ETL and is described later. In contrast to offline ETL,
which supplies stale but consistent answers to queries, online ETL supplies current but
varying answers to successive queries, since the data it uses is constantly being updated
to reflect the current state of the business.
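The "current but varying answers" behaviour of online ETL can be sketched as follows. This is a simplified illustration under stated assumptions: the `change_log` queue stands in for whatever change-capture mechanism feeds the warehouse, and the event fields and the functions `apply_next_change` and `total_sales` are hypothetical names chosen for the example.

```python
from collections import deque

# Hypothetical change events captured from the source system as they occur.
change_log = deque([
    {"id": 1, "price": 100.0, "discount": 10.0},
    {"id": 2, "price": 250.0, "discount": 0.0},
    {"id": 3, "price": 80.0, "discount": 5.0},
])

warehouse = {}  # maps sale id -> sale price; stands in for the target database


def apply_next_change(change_log, warehouse):
    """Transform and load a single change event; return False when none remain."""
    if not change_log:
        return False
    event = change_log.popleft()
    warehouse[event["id"]] = event["price"] - event["discount"]
    return True


def total_sales(warehouse):
    """The same query, asked repeatedly against the continuously updated target."""
    return sum(warehouse.values())


answers = []
while apply_next_change(change_log, warehouse):
    answers.append(total_sales(warehouse))

print(answers)
```

Each pass applies one change and then repeats the same query, so the answers differ from one query to the next. An offline ETL target would instead return the same answer until the next batch run.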
LOVELY PROFESSIONAL UNIVERSITY 5