Page 10 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 10
Data Warehousing and Data Mining
notes 1.1.3 Differences between operational Database systems and
Data Warehouses
The first major stepping stone in understanding Data Warehousing is to grasp the concepts and
differences between the two overall database categories. The type most of us are used to dealing
with is the On Line Transactional Processing (OLTP) category. The other major category is On
Line Analytical Processing (OLAP).
OLTP is what we characterise as the ongoing day-to-day functional copy of the database. It is
where data is added and updated but never overwritten or deleted. The main needs of the OLTP
operational database being easily controlled insertion and updating of data with efficient access
to data manipulation and viewing mechanisms. Typically only single record or small record-sets
should be manipulated in a single operation in an OLTP designed database. The main thrust
here is to avoid having the same data in different tables. This basic tenet of Relational Database
modeling is known as “normalising” data.
OLAP is a broad term that also encompasses data warehousing. In this model data is stored
in a format, which enables the efficient creation of data mining/reports. OLAP design should
accommodate reporting on very large record sets with little degradation in operational efficiency.
The overall term used to describe taking data structures in an OLTP format and holding the same
data in an OLAP format is “Dimensional Modeling” It is the primary building block of Data
Warehousing.
The major distinguishing features between OLTP and OLAP are summarised as follows:
feature oLtp system oLap system
Characteristic Operational Processing Informational Processing
Users Clerks, clients, and information Knowledge workers, including managers,
technology professionals. executives, and analysts.
System orientation Customer oriented and used for Market-oriented and used for data analysis
transaction and query processing Day long term informational requirements,
to day operations decision support.
Data contents Manages current data that typically, Manages large amounts of historical
are too detailed to be easily used for data, provides facilities for summa-
decision making. risation and aggregation, and stores and
manages information at different levels of
granularity.
Database design Adopts an entity-relationship (ER) Adopts either a star or snowflake model
data model and an application- and a subject-oriented database design.
oriented database design
View Focuses mainly on the current data In contrast, an OLAP system often spans
within an enterprise or department, multiple versions of a database schema,
without referring to historical data or due to the evolutionary process of an
data in different organisations. organisation. OLAP systems also deal
with information that originates from
different organisations, integrating
information from many data stores.
Volume of data Not very large Because of their huge volume, OLAP data
are stored on multiple storage media.
Access patterns Consists mainly of short, atomic Accesses to OLAP systems are mostly
transactions. Such a system requires read-only operations (since most data
concurrency control and recovery warehouses store historical rather than up-
mechanisms. to-date information), although many could
be complex queries.
Contd...
4 LoveLy professionaL university