Page 167 - DMGT505_MANAGEMENT_INFORMATION_SYSTEM
P. 167
Management Information Systems
Notes 5. Data warehousing is also very useful from the point of view of heterogeneous database
integration. Many organisations typically collect diverse kinds of data and maintain large
databases from multiple, heterogeneous, autonomous, and distributed information sources.
Query Driven Approach versus Update Driven Approach for Heterogeneous
Database Integration
For heterogeneous database integration, the traditional database implements query-driven
approach, which requires complex information filtering and integration processes, and competes
for resources with processing at local sources. It is inefficient and potentially expensive for
frequent queries, especially for queries requiring aggregations.
In query-driven approach, data warehousing employs an update-driven approach in which
information from multiple, heterogeneous sources is integrated in advance and stored in a
warehouse for direct querying and analysis. In this approach, a data warehouse brings high
performance to the integrated heterogeneous database system since data are copied, preprocessed,
integrated, annotated, summarized, and restructured into one semantic data store. Furthermore,
query processing in data warehouses does not interfere with the processing at local sources.
Moreover, data warehouses can store and integrate historical information and support complex
multidimensional queries. As a result, data warehousing has become very popular in industry.
Differences between Operational Database Systems and Data Warehouses
The first major stepping stone in understanding Data Warehousing is to grasp the concepts and
differences between the two overall database categories. The type most of us are used to dealing
with is the On Line Transactional Processing (OLTP) category. The other major category is On
Line Analytical Processing (OLAP).
OLTP is what we characterise as the ongoing day-to-day functional copy of the database. It is
where data is added and updated but never overwritten or deleted. The main needs of the OLTP
operational database being easily controlled insertion and updating of data with efficient access
to data manipulation and viewing mechanisms. Typically only single record or small record-
sets should be manipulated in a single operation in an OLTP designed database. The main thrust
here is to avoid having the same data in different tables. This basic tenet of Relational Database
modeling is known as “normalising” data.
OLAP is a broad term that also encompasses data warehousing. In this model data is stored in a
format, which enables the efficient creation of data mining/reports. OLAP design should
accommodate reporting on very large record sets with little degradation in operational efficiency.
The overall term used to describe taking data structures in an OLTP format and holding the same
data in an OLAP format is “Dimensional Modeling” It is the primary building block of Data
Warehousing.
The major distinguishing features between OLTP and OLAP are summarised as follows:
162 LOVELY PROFESSIONAL UNIVERSITY