Page 8 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
Data Warehousing and Data Mining
Introduction
Remember using Lotus 1-2-3? This was your first taste of “What if?” processing on the desktop.
This is what a data warehouse is all about: using information your business has gathered to help
it react better, smarter, quicker and more efficiently.
To expand upon this definition, a data warehouse is a collection of corporate information,
derived directly from operational systems and some external data sources. Its specific purpose is
to support business decisions, not business operations. This is what a data warehouse is all about,
helping your business ask “What if?” questions. The answers to these questions will ensure your
business is proactive, instead of reactive, a necessity in today’s information age.
The industry trend today is moving towards more powerful hardware and software configurations.
With these more powerful configurations, we now have the ability to process vast volumes
of information analytically, which would have been unheard of ten or even five years ago.
A business today must be able to use this emerging technology or run the risk of being information
under-loaded. You read that correctly: under-loaded, the opposite of overloaded. Being overloaded
means you are so overwhelmed by the enormous glut of information that it is hard to wade through
it to determine what is important. If you are under-loaded, you are information deficient. You
cannot cope with decision-making exceptions because you do not know where you stand. You
are missing critical pieces of information required to make informed decisions.
In today’s world, you do not want to be the country mouse. In today’s world, full of vast amounts
of unfiltered information, a business that does not effectively use technology to sift through that
information will not survive the information age. Access to and the understanding of information
is power. This power equates to a competitive advantage and survival.
1.1 What is a Data Warehouse?
Data warehousing provides architectures and tools for business executives to systematically
organise, understand, and use their data to make strategic decisions. In the last several years,
many firms have spent millions of dollars building enterprise-wide data warehouses, on the
assumption that learning more about customers' needs is a way to keep them.
In simple terms, a data warehouse refers to a database that is maintained separately from an
organization’s operational databases. Data warehouse systems allow for the integration of
a variety of application systems. They support information processing by providing a solid
platform of consolidated, historical data for analysis.
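The separation described above can be illustrated with a minimal sketch. It assumes two in-memory SQLite databases standing in for an operational system and a warehouse. The table names (`accounts`, `balance_history`), the `snapshot` helper, the source label and the dates are all hypothetical, chosen only for illustration. The operational table holds the current state and is overwritten by updates, while the warehouse accumulates dated copies that remain available for analysis.

```python
import sqlite3


def snapshot(operational, warehouse, source, load_date):
    """Copy the current operational rows into the warehouse, tagged with source and load date."""
    rows = operational.execute(
        "SELECT customer_id, balance FROM accounts"
    ).fetchall()
    warehouse.executemany(
        "INSERT INTO balance_history (source, load_date, customer_id, balance) "
        "VALUES (?, ?, ?, ?)",
        [(source, load_date, cid, bal) for cid, bal in rows],
    )
    warehouse.commit()


# Operational database (hypothetical): holds only the current state.
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE accounts (customer_id TEXT PRIMARY KEY, balance REAL)"
)
operational.execute("INSERT INTO accounts VALUES ('C001', 100.0)")

# Warehouse (hypothetical): a separate database that keeps history.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE balance_history "
    "(source TEXT, load_date TEXT, customer_id TEXT, balance REAL)"
)

# First load into the warehouse.
snapshot(operational, warehouse, "billing", "2024-01-31")

# Day-to-day operations overwrite the operational value...
operational.execute(
    "UPDATE accounts SET balance = 250.0 WHERE customer_id = 'C001'"
)
# ...and the next load adds a new dated row instead of replacing the old one.
snapshot(operational, warehouse, "billing", "2024-02-29")

history = warehouse.execute(
    "SELECT load_date, balance FROM balance_history "
    "WHERE customer_id = 'C001' ORDER BY load_date"
).fetchall()
print(history)
```

After both loads, the operational table still contains a single row for the customer, while the warehouse holds one row per load date. That accumulated history is what makes trend questions answerable without touching the operational system.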
According to W. H. Inmon, a leading architect in the construction of data warehouse systems,
“a data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection
of data in support of management’s decision making process.” The four keywords, subject-oriented,
integrated, time-variant, and nonvolatile, distinguish data warehouses from other data
repository systems, such as relational database systems, transaction processing systems, and file
systems. Let us look at the four keywords in more detail:
1. Subject-oriented: A data warehouse focuses on the modeling and analysis of data for
decision makers. Therefore, data warehouses typically provide a simple and concise view
around particular subject issues by excluding data that are not useful in the decision
support process.
2. Integrated: A data warehouse is usually constructed by integrating multiple
heterogeneous sources, such as relational databases, flat files, and on-line transaction
records. Data cleaning and data integration techniques are applied to ensure
consistency in naming conventions, encoding structures, attribute measures, and so on.
Lovely Professional University