Page 10 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 10

Data Warehousing and Data Mining




                    notes          1.1.3 Differences between operational Database systems and
                                            Data Warehouses

                                   The first major stepping stone in understanding Data Warehousing is to grasp the concepts and
                                   differences between the two overall database categories. The type most of us are used to dealing
                                   with is the On Line Transactional Processing (OLTP) category. The other major category is On
                                   Line Analytical Processing (OLAP).
                                   OLTP is what we characterise as the ongoing day-to-day functional copy of the database. It is
                                   where data is added and updated but never overwritten or deleted. The main needs of the OLTP
                                   operational database being easily controlled insertion and updating of data with efficient access
                                   to data manipulation and viewing mechanisms. Typically only single record or small record-sets
                                   should be manipulated in a single operation in an OLTP designed database. The main thrust
                                   here is to avoid having the same data in different tables. This basic tenet of Relational Database
                                   modeling is known as “normalising” data.

                                   OLAP is a broad term that also encompasses data warehousing. In this model data is stored
                                   in a format, which enables the efficient creation of data mining/reports. OLAP design should
                                   accommodate reporting on very large record sets with little degradation in operational efficiency.
                                   The overall term used to describe taking data structures in an OLTP format and holding the same
                                   data in an OLAP format is “Dimensional Modeling” It is the primary building block of Data
                                   Warehousing.
                                   The major distinguishing features between OLTP and OLAP are summarised as follows:

                                        feature            oLtp system                    oLap system
                                   Characteristic  Operational Processing       Informational Processing
                                   Users           Clerks,   clients,   and   information  Knowledge  workers,  including  managers,
                                                   technology professionals.    executives, and analysts.
                                   System orientation  Customer  oriented  and  used  for  Market-oriented and used for data analysis
                                                   transaction and query processing Day  long  term  informational  requirements,
                                                   to day operations            decision support.
                                   Data contents   Manages  current  data  that  typically,  Manages  large  amounts  of  historical
                                                   are  too  detailed  to  be  easily  used  for  data,  provides  facilities  for  summa-
                                                   decision making.             risation  and  aggregation,  and  stores  and
                                                                                manages information at different levels of
                                                                                granularity.
                                   Database design  Adopts  an  entity-relationship  (ER)  Adopts  either  a  star  or  snowflake  model
                                                   data  model  and  an  application-  and a subject-oriented database design.
                                                   oriented database design
                                   View            Focuses  mainly  on  the  current  data  In  contrast,  an  OLAP  system  often  spans
                                                   within  an  enterprise  or  department,  multiple  versions  of  a  database  schema,
                                                   without referring to historical data or  due  to  the  evolutionary  process  of  an
                                                   data in different organisations.  organisation.  OLAP  systems  also  deal
                                                                                with  information  that  originates  from
                                                                                different   organisations,   integrating
                                                                                information from many data stores.
                                   Volume of data  Not very large               Because of their huge volume, OLAP data
                                                                                are stored on multiple storage media.
                                   Access patterns  Consists  mainly  of  short,  atomic  Accesses  to  OLAP  systems  are  mostly
                                                   transactions.  Such  a  system  requires  read-only  operations  (since  most  data
                                                   concurrency  control  and  recovery  warehouses store historical rather than up-
                                                   mechanisms.                  to-date information), although many could
                                                                                be complex queries.
                                                                                                          Contd...





          4                                LoveLy professionaL university
   5   6   7   8   9   10   11   12   13   14   15