Page 88 - DCAP208_Management Support Systems
Unit 6: Data Warehousing
• they mark out the information required by a specific group of users to solve queries;
• they can deliver better performance because they are smaller than primary data
warehouses.
Sometimes, mainly for organizational and policy reasons, a different architecture is used
in which sources directly populate the data marts. These data marts are called independent.
Without a primary data warehouse, the design process is streamlined, but there is a risk
of inconsistencies between data marts. To avoid these problems, you can create a primary data
warehouse and still have independent data marts. Compared with the standard two-layer
architecture of Figure 6.3, the roles of data marts and data warehouse are then inverted:
the data warehouse is populated from its data marts, and it can be queried directly
to make access patterns as easy as possible.
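The inverted flow and the inconsistency risk can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the source rows, the two marts (`sales` and `finance`), and their differing revenue rules are hypothetical and do not come from the text.

```python
# Hypothetical operational source rows: (order_id, region, amount, status)
SOURCE_ORDERS = [
    (1, "north", 100.0, "shipped"),
    (2, "north", 40.0, "cancelled"),
    (3, "south", 75.0, "shipped"),
]


def build_sales_mart(orders):
    """Independent sales mart: revenue per region, counting every order."""
    totals = {}
    for _order_id, region, amount, _status in orders:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def build_finance_mart(orders):
    """Independent finance mart: revenue per region, ignoring cancelled orders."""
    totals = {}
    for _order_id, region, amount, status in orders:
        if status != "cancelled":
            totals[region] = totals.get(region, 0.0) + amount
    return totals


def find_inconsistencies(mart_a, mart_b):
    """Return the regions whose revenue differs between the two marts."""
    regions = set(mart_a) | set(mart_b)
    return sorted(r for r in regions if mart_a.get(r) != mart_b.get(r))


def populate_warehouse_from_marts(marts):
    """Inverted flow: the warehouse is filled from its data marts."""
    return {name: dict(contents) for name, contents in marts.items()}


sales_mart = build_sales_mart(SOURCE_ORDERS)
finance_mart = build_finance_mart(SOURCE_ORDERS)
warehouse = populate_warehouse_from_marts(
    {"sales": sales_mart, "finance": finance_mart}
)
conflicts = find_inconsistencies(sales_mart, finance_mart)
print(conflicts)
```

Because each mart applies its own rule directly to the sources, the two marts disagree on the revenue of one region. Collecting the marts into a single warehouse makes such disagreements visible in one place, but it does not resolve them by itself.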
The following list sums up all the benefits of a two-layer architecture, in which a data warehouse
separates sources from analysis applications:
• In data warehouse systems, good quality information is always available, even when
access to sources is denied temporarily for technical or organizational reasons.
• Data warehouse analysis queries do not affect the management of transactions, the
reliability of which is vital for enterprises to work properly at an operational level.
• Data warehouses are logically structured according to the multidimensional model, while
operational sources are generally based on relational or semi-structured models.
• A mismatch in terms of time and granularity occurs between OLTP systems, which manage
current data at a maximum level of detail, and OLAP systems, which manage historical
and summarized data.
• Data warehouses can use specific design solutions aimed at performance optimization of
analysis and report applications.
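The granularity mismatch between OLTP and OLAP data can be sketched in a few lines of Python. The transaction rows and the function name `summarize_by_month` are hypothetical, chosen only to illustrate the idea: detailed operational rows are rolled up into the summarized form that analysis queries typically use.

```python
from collections import defaultdict
from datetime import date

# Hypothetical OLTP rows at the maximum level of detail:
# (sale_date, product, amount)
transactions = [
    (date(2023, 1, 5), "widget", 20.0),
    (date(2023, 1, 17), "widget", 30.0),
    (date(2023, 2, 2), "widget", 10.0),
    (date(2023, 2, 9), "gadget", 55.0),
]


def summarize_by_month(rows):
    """Roll detailed sales up to one total per (year, month, product)."""
    summary = defaultdict(float)
    for sale_date, product, amount in rows:
        summary[(sale_date.year, sale_date.month, product)] += amount
    return dict(summary)


# OLAP-style view: fewer rows, coarser time granularity
monthly_sales = summarize_by_month(transactions)
for key in sorted(monthly_sales):
    print(key, monthly_sales[key])
```

In this sketch the four detailed rows collapse into three monthly totals. A warehouse would normally keep such summaries for a long history, whereas the operational system only needs the current detailed rows.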
Task: Analyze the use of a primary data warehouse.
6.3.3 Three-Layer Architecture
In this architecture, the third layer is the reconciled data layer or operational data store. This
layer materializes operational data obtained after integrating and cleansing source data. As a
result, those data are integrated, consistent, correct, current, and detailed. Figure 6.4 shows a
data warehouse that is not populated from its sources directly, but from reconciled data. The
main advantage of the reconciled data layer is that it creates a common reference data model for
a whole enterprise. At the same time, it sharply separates the problems of source data extraction
and integration from those of data warehouse population. Remarkably, in some cases, the
reconciled layer is also directly used to better accomplish some operational tasks, such as
producing daily reports that cannot be satisfactorily prepared using the corporate applications,
or generating data flows to feed external processes periodically so as to benefit from cleansing
and integration. However, reconciled data introduces additional redundancy of operational source data.
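The separation between source integration and warehouse population can be sketched as two independent steps. This is a minimal illustration under stated assumptions: the two source extracts (`crm_customers`, `billing_customers`), their field names, and the cleansing rules are all hypothetical.

```python
# Hypothetical source extracts that follow different conventions
crm_customers = [
    {"id": "C1", "name": " Alice ", "country": "it"},
    {"id": "C2", "name": "Bob", "country": "IT"},
]
billing_customers = [
    {"cust_id": "C2", "full_name": "Bob", "country_code": "IT"},
    {"cust_id": "C3", "full_name": "carol", "country_code": "fr"},
]


def cleanse(customer_id, name, country):
    """Map one source record onto the common reference model."""
    return {
        "customer_id": customer_id,
        "name": name.strip().title(),
        "country": country.strip().upper(),
    }


def reconcile(crm_rows, billing_rows):
    """Step 1: integrate and cleanse sources into the reconciled layer."""
    reconciled = {}
    for row in crm_rows:
        record = cleanse(row["id"], row["name"], row["country"])
        reconciled[record["customer_id"]] = record
    for row in billing_rows:
        record = cleanse(row["cust_id"], row["full_name"], row["country_code"])
        # Keep the CRM version when both sources describe the same customer
        reconciled.setdefault(record["customer_id"], record)
    return reconciled


def load_warehouse(reconciled):
    """Step 2: populate the warehouse from reconciled data only."""
    customers_per_country = {}
    for record in reconciled.values():
        country = record["country"]
        customers_per_country[country] = customers_per_country.get(country, 0) + 1
    return customers_per_country


reconciled_layer = reconcile(crm_customers, billing_customers)
customers_by_country = load_warehouse(reconciled_layer)
print(customers_by_country)
```

Note that `load_warehouse` never touches the source extracts. Any change in a source format affects only `reconcile`, which is the separation of concerns the reconciled layer is meant to provide. The reconciled records also duplicate source data, which is the redundancy mentioned above.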
Notes: We may assume that even two-layer architectures can have a reconciled
layer that is not specifically materialized, but only virtual, because it is defined as a
consistent integrated view of operational source data.
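A virtual reconciled layer of this kind can be sketched with a database view, here using Python's built-in `sqlite3` module. The table names, columns, and sample rows are hypothetical. The view stores no data of its own; it is only a definition evaluated against the source tables at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical operational source tables
    CREATE TABLE crm_customers (id TEXT, name TEXT, country TEXT);
    CREATE TABLE billing_customers (cust_id TEXT, full_name TEXT, country_code TEXT);
    INSERT INTO crm_customers VALUES ('C1', ' Alice ', 'it'), ('C2', 'Bob', 'IT');
    INSERT INTO billing_customers VALUES ('C2', 'Bob', 'IT'), ('C3', 'Carol', 'fr');

    -- Virtual reconciled layer: an integrated, cleansed view, not a stored copy
    CREATE VIEW reconciled_customers AS
        SELECT id AS customer_id,
               TRIM(name) AS name,
               UPPER(TRIM(country)) AS country
        FROM crm_customers
        UNION
        SELECT cust_id, TRIM(full_name), UPPER(TRIM(country_code))
        FROM billing_customers;
    """
)

rows = conn.execute(
    "SELECT customer_id, name, country FROM reconciled_customers "
    "ORDER BY customer_id"
).fetchall()
print(rows)

# A new source row is visible through the view without any reload step
conn.execute("INSERT INTO crm_customers VALUES ('C4', 'Dave', 'de')")
count_after_insert = conn.execute(
    "SELECT COUNT(*) FROM reconciled_customers"
).fetchone()[0]
print(count_after_insert)
```

The trade-off is the usual one between views and materialized data: the virtual layer adds no redundancy, but every query against it pays the cost of integrating the sources again.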
LOVELY PROFESSIONAL UNIVERSITY 81