Page 13 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
Unit 1: Data Warehouse Practice
Data Warehouse
The data warehouse is normally (but does not have to be) a relational database. It must be
organized to hold information in a structure that best supports not only query and reporting, but
also advanced analysis techniques such as data mining. Most data warehouses hold information for
at least one year, and some retain it for as long as half a century, depending on the business or
operational data retention requirements. As a result, these databases can become very large.
Reporting
The data in the data warehouse must be available to the organisation’s staff if the data warehouse
is to be useful. A very large number of software applications perform this function; alternatively,
reporting can be custom-developed. Examples of types of reporting tools include:
1. Business intelligence tools: These are software applications that simplify the process of
development and production of business reports based on data warehouse data.
2. Executive information systems (known more widely as dashboards): These are
software applications that are used to display complex business metrics and information
in a graphical way to allow rapid understanding.
3. OLAP Tools: OLAP tools organize data into logical multi-dimensional structures and let
users select the dimensions by which to view the data.
4. Data Mining: Data mining tools are software applications that allow users to perform
mathematical and statistical calculations on detailed data warehouse data to detect trends,
identify patterns and analyze data.
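The idea behind OLAP tools, viewing the same data by whichever dimensions the user selects, can be illustrated with a minimal sketch. The sales rows, the dimension names (region, product, year) and the roll_up function are all hypothetical and chosen only for illustration; real OLAP tools work against far larger, pre-organized structures.

```python
from collections import defaultdict

# Hypothetical detail rows, as they might sit in a data warehouse table.
SALES = [
    {"region": "North", "product": "Widget", "year": 2023, "amount": 120.0},
    {"region": "North", "product": "Gadget", "year": 2023, "amount": 80.0},
    {"region": "South", "product": "Widget", "year": 2023, "amount": 200.0},
    {"region": "South", "product": "Widget", "year": 2024, "amount": 50.0},
]


def roll_up(rows, dimensions, measure="amount"):
    """Sum `measure` for each combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        # The key is the tuple of values for the dimensions the user selected.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same data viewed by one dimension, then by two.
by_region = roll_up(SALES, ["region"])
by_region_year = roll_up(SALES, ["region", "year"])
```

Changing the list of dimensions changes the view without changing the underlying data, which is the behaviour the OLAP item above describes.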
Metadata
Metadata, or “data about data”, is used not only to inform operators and users of the data
warehouse about its status and the information held within it, but also as a means of
integrating incoming data and as a tool for updating and refining the underlying data
warehouse (DW) model.
Example: Data warehouse metadata include table and column names, their detailed
descriptions, their connection to business-meaningful names, the most recent data load date, the
business meaning of a data item and the number of users that are currently logged in.
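As a rough sketch of how such metadata might be recorded, the following uses two small record types. The class names, field names and the sample table cust_dim are assumptions made for illustration and do not come from any particular data warehouse product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ColumnMeta:
    name: str            # physical column name
    business_name: str   # business-meaningful name
    description: str     # detailed description of the data item


@dataclass
class TableMeta:
    name: str
    description: str
    last_load_date: date                      # most recent data load
    columns: list = field(default_factory=list)


def describe(table):
    """Build a short, human-readable summary of one table's metadata."""
    lines = [f"{table.name} (last loaded {table.last_load_date.isoformat()})"]
    for col in table.columns:
        lines.append(f"  {col.name} -> {col.business_name}: {col.description}")
    return "\n".join(lines)


# Hypothetical catalog holding metadata for a single table.
catalog = {
    "cust_dim": TableMeta(
        name="cust_dim",
        description="One row per customer",
        last_load_date=date(2024, 1, 31),
        columns=[
            ColumnMeta("cust_nm", "Customer Name", "Full legal name of the customer"),
        ],
    )
}
report = describe(catalog["cust_dim"])
```

A catalog of this kind lets a user look up what a column means and how fresh the data is without querying the data itself.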
Operations
Data warehouse operations comprise the processes of loading, manipulating and
extracting data from the data warehouse. Operations also cover user management, security,
capacity management and related functions.
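The three data-handling processes (loading, manipulating and extracting) can be sketched with Python's built-in sqlite3 module standing in for the warehouse database. The table names sales and sales_by_region and the sample rows are hypothetical; a production warehouse would use dedicated loading tools and a much larger database.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Load: insert incoming rows into the warehouse table.
incoming = [("North", 120.0), ("North", 80.0), ("South", 200.0)]
conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", incoming)

# Manipulate: derive a summary table from the detailed data.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)
conn.commit()

# Extract: read the summarized data back out for downstream use.
extract = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
conn.close()
```

User management, security and capacity management are outside the scope of this sketch.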
Optional Components
In addition, the following components exist in some data warehouses:
1. Dependent Data Marts: A dependent data mart is a physical database (either on the
same hardware as the data warehouse or on a separate hardware platform) that receives
all its information from the data warehouse. The purpose of a data mart is to provide a
subset of the data warehouse’s data for a specific purpose or to a specific subgroup of
the organization. Technically, a data mart is exactly like a data warehouse, but it serves a
different business purpose: it either holds information for only part of a company (such as
a division), or it holds a small selection of information for the entire company (to support