Page 272 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
Data Warehousing and Data Mining
14.3.4 Data Mining is a Data Warehouse Tool
Data mining is a technology that applies sophisticated and complex algorithms to analyze data
and expose interesting information for analysis by decision makers. Whereas OLAP organizes
data in a model suited for exploration by analysts, data mining performs analysis on data and
provides the results to decision makers. Thus, OLAP supports model-driven analysis and data
mining supports data-driven analysis.
Data mining has traditionally operated only on raw data in the data warehouse database or,
more commonly, text files of data extracted from the data warehouse database. In SQL Server
2000, Analysis Services provides data mining technology that can analyze data in OLAP cubes,
as well as data in the relational data warehouse database. In addition, data mining results can
be incorporated into OLAP cubes to further enhance model-driven analysis by providing an
additional dimensional viewpoint into the OLAP model. For example, data mining can be used
to analyze sales data against customer attributes and create a new cube dimension to assist the
analyst in the discovery of the information embedded in the cube data.
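The idea of mining sales data against customer attributes to produce a new dimension can be illustrated with a minimal sketch. It is not how Analysis Services implements data mining; it is a plain two-cluster k-means written for illustration only. The sales rows, the customer ages, the function names (kmeans, build_segment_dimension) and the segment labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows from a sales fact table: (customer_id, amount).
sales = [
    (1, 40.0), (2, 30.0), (2, 25.0), (3, 35.0),
    (4, 400.0), (5, 200.0), (5, 250.0), (6, 380.0),
]
# Hypothetical customer attribute: customer_id -> age.
customers = {1: 23, 2: 25, 3: 28, 4: 47, 5: 52, 6: 55}


def kmeans(points, seeds, iterations=20):
    """Plain k-means; returns the cluster index assigned to each point."""
    centroids = list(seeds)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2
                                  for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return assignment


def build_segment_dimension(sales, customers):
    """Map each customer to a mined segment usable as a new dimension."""
    totals = defaultdict(float)
    for cid, amount in sales:
        totals[cid] += amount
    ids = sorted(customers)
    raw = [(customers[c], totals[c]) for c in ids]  # (age, total spend)
    # Min-max scale each feature so age and spend carry comparable weight.
    lows = [min(col) for col in zip(*raw)]
    highs = [max(col) for col in zip(*raw)]
    scaled = [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in raw
    ]
    # Seed with the lowest and highest spenders so the run is repeatable.
    lo_i = min(range(len(raw)), key=lambda i: raw[i][1])
    hi_i = max(range(len(raw)), key=lambda i: raw[i][1])
    labels = kmeans(scaled, [scaled[lo_i], scaled[hi_i]])
    names = {0: "low-value", 1: "high-value"}  # hypothetical labels
    return {cid: names[label] for cid, label in zip(ids, labels)}


segment_of = build_segment_dimension(sales, customers)

# Roll the detail rows up along the newly mined dimension.
sales_by_segment = defaultdict(float)
for cid, amount in sales:
    sales_by_segment[segment_of[cid]] += amount

print(segment_of)
print(dict(sales_by_segment))
```

The segment assigned to each customer plays the role of the new cube dimension: an analyst can then slice total sales by segment in the same way as by product or date.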
14.3.5 Designing a Data Warehouse: Prerequisites
Before embarking on the design of a data warehouse, it is imperative that the architectural goals
of the data warehouse be clear and well understood. Because the purpose of a data warehouse
is to serve users, it is also critical to understand the various types of users, their needs, and the
characteristics of their interactions with the data warehouse.
Data Warehouse Architecture Goals
A data warehouse exists to serve its users: analysts and decision makers. A data warehouse must
be designed to satisfy the following requirements:
1. Deliver a great user experience; user acceptance is the measure of success
2. Function without interfering with OLTP systems
3. Provide a central repository of consistent data
4. Answer complex queries quickly
5. Provide a variety of powerful analytical tools, such as OLAP and data mining
Most successful data warehouses that meet these requirements have these common
characteristics:
1. Are based on a dimensional model
2. Contain historical data
3. Include both detailed and summarized data
4. Consolidate disparate data from multiple sources while retaining consistency
5. Focus on a single subject, such as sales, inventory, or finance
Data warehouses are often quite large. However, size is not an architectural goal; it is a characteristic
driven by the amount of data needed to serve the users.
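Several of the characteristics listed above (a dimensional model, historical data, and both detailed and summarized data) can be made concrete with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database purely for illustration. The table names (dim_product, dim_date, fact_sales), the columns and the sample rows are assumptions, not a schema prescribed by the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A minimal star schema: one fact table surrounded by dimension tables.
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    quantity    INTEGER,
    amount      REAL
);
""")

# Hypothetical sample rows; two years are loaded to show historical data.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
    (3, "Manual", "Books"),
])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20230115, 2023, 1),
    (20240115, 2024, 1),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 20230115, 2, 20.0),
    (2, 20230115, 1, 50.0),
    (3, 20230115, 4, 40.0),
    (1, 20240115, 3, 30.0),
    (3, 20240115, 1, 10.0),
])

# Detailed data: every individual sale remains available.
detail_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]

# Summarized data: the same facts rolled up by year and product category.
summary = conn.execute("""
    SELECT d.year, p.category, SUM(f.amount) AS total_amount
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date    AS d ON d.date_key    = f.date_key
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()

print(detail_count)
for row in summary:
    print(row)
```

In a production warehouse the summaries would more likely be precomputed and stored, for example in OLAP cubes, rather than recalculated by each query; the query above only shows how the dimensions give the facts their analytical meaning.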
14.3.6 Data Warehouse Users
The success of a data warehouse is measured solely by its acceptance by users. Without users,
historical data might as well be archived to magnetic tape and stored in the basement. Successful
data warehouse design starts with understanding the users and their needs.
Lovely Professional University