
Data Warehousing and Data Mining




                                   Introduction

                                   Remember using Lotus 1-2-3? It was your first taste of “What if?” processing on the desktop.
                                   This is what a data warehouse is all about: using the information your business has gathered
                                   to help it react better, smarter, quicker and more efficiently.

                                   To expand upon this definition, a data warehouse is a collection of corporate information,
                                   derived directly from operational systems and some external data sources. Its specific purpose is
                                   to support business decisions, not business operations. In other words, it helps your business
                                   ask “What if?” questions. The answers to these questions help your business be proactive
                                   instead of reactive, a necessity in today’s information age.

                                   The industry trend today is towards more powerful hardware and software configurations.
                                   With these configurations, we now have the ability to process vast volumes of information
                                   analytically, which would have been unheard of ten or even five years ago. A business today
                                   must be able to use this emerging technology or run the risk of being information
                                   under-loaded. You read that correctly: under-loaded, the opposite of overloaded. Overloaded
                                   means you are so overwhelmed by the enormous glut of information that it is hard to wade
                                   through it to determine what is important. If you are under-loaded, you are information
                                   deficient. You cannot cope with decision-making exceptions because you do not know where you
                                   stand. You are missing critical pieces of information required to make informed decisions.

                                   In today’s world, you do not want to be the country mouse. With vast amounts of unfiltered
                                   information available, a business that does not effectively use technology to sift through that
                                   information will not survive the information age. Access to and the understanding of information
                                   is power. This power equates to a competitive advantage and survival.

                                   1.1 What is a Data Warehouse?

                                   Data warehousing provides architectures and tools for business executives to systematically
                                   organise, understand, and use their data to make strategic decisions. In the last several years,
                                   many firms have spent millions of dollars building enterprise-wide data warehouses, on the
                                   assumption that learning more about customers’ needs is a way to keep them.

                                   In simple terms, a data warehouse refers to a database that is maintained separately from an
                                   organization’s operational databases. Data warehouse systems allow for the integration of
                                   a variety of application systems. They support information processing by providing a solid
                                   platform of consolidated, historical data for analysis.

                                   According to W. H. Inmon, a leading architect in the construction of data warehouse systems,
                                   “a data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection
                                   of data in support of management’s decision making process.” The four keywords,
                                   subject-oriented, integrated, time-variant, and nonvolatile, distinguish data warehouses from
                                   other data repository systems, such as relational database systems, transaction processing
                                   systems, and file systems. The four keywords are explained in more detail below:

                                   1.   Subject-oriented: A data warehouse focuses on the modeling and analysis of data for
                                       decision makers. Therefore, data warehouses typically provide a simple and concise view
                                       around particular subject issues by excluding data that are not useful in the decision
                                       support process.

                                   2.   Integrated: As the data warehouse is usually constructed by integrating multiple
                                       heterogeneous sources, such as relational databases, flat files, and on-line transaction
                                       records, data cleaning and data integration techniques need to be applied to ensure
                                       consistency in naming conventions, encoding structures, attribute measures, and so on.
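
To make the "integrated" property concrete, the following is a minimal Python sketch, not a prescribed method. It assumes two hypothetical operational sources that describe the same kind of customer record with different field names (naming conventions), different gender codes (encoding structures), and different height units (attribute measures). Every field name, code mapping, and sample row is invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical rows from two operational sources (all values are made up).
crm_rows = [  # e.g. rows read from a relational table
    {"cust_id": 101, "gender": "M", "height_cm": 180.0},
    {"cust_id": 102, "gender": "F", "height_cm": 165.0},
]
billing_rows = [  # e.g. rows parsed from a flat-file export
    {"CustomerNo": "103", "sex": "1", "height_in": 70.0},
    {"CustomerNo": "104", "sex": "0", "height_in": 64.0},
]


@dataclass(frozen=True)
class Customer:
    """The single, consistent record layout used inside the warehouse."""
    customer_id: int
    gender: str        # always "M" or "F"
    height_cm: float   # always centimetres


# Assumed meaning of the billing system's numeric gender codes.
SEX_CODES = {"1": "M", "0": "F"}
CM_PER_INCH = 2.54


def from_crm(row: dict) -> Customer:
    # Only the field names differ from the warehouse layout.
    return Customer(int(row["cust_id"]), row["gender"], float(row["height_cm"]))


def from_billing(row: dict) -> Customer:
    # Names, encoding, and unit of measure all need to be reconciled.
    return Customer(
        customer_id=int(row["CustomerNo"]),
        gender=SEX_CODES[row["sex"]],
        height_cm=round(float(row["height_in"]) * CM_PER_INCH, 1),
    )


def integrate(crm: list, billing: list) -> list:
    """Map both sources onto the common layout and combine them."""
    merged = [from_crm(r) for r in crm] + [from_billing(r) for r in billing]
    return sorted(merged, key=lambda c: c.customer_id)


warehouse_customers = integrate(crm_rows, billing_rows)
for customer in warehouse_customers:
    print(customer)
```

After integration, every record uses the same field names, the same gender codes, and the same unit, regardless of which source it came from. A real warehouse load would also need to handle missing values, duplicate customers, and unknown codes, which this sketch leaves out.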






          2                                Lovely Professional University