Page 13 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 1: Data Warehouse Practice




          Data Warehouse

          The data warehouse is normally (but does not have to be) a relational database. It must be
          organized to hold information in a structure that best supports not only query and reporting, but
          also advanced analysis techniques such as data mining. Most data warehouses hold information for
          at least one year, and some retain it for as long as half a century, depending on the business or
          operational data retention requirements. As a result, these databases can become very large.

          Reporting

          The data in the data warehouse must be available to the organisation’s staff if the data warehouse
          is to be useful. There are a very large number of software applications that perform this function,
          or reporting can be custom-developed. Examples of types of reporting tools include:
          1.   Business intelligence tools: These are software applications that simplify the process of
               development and production of business reports based on data warehouse data.
          2.   Executive information systems (more widely known as business dashboards): These are
               software applications that are used to display complex business metrics and information
               in a graphical way to allow rapid understanding.
          3.   OLAP Tools: OLAP tools form data into logical multi-dimensional structures and allow
               users to select which dimensions to view data by.

          4.   Data Mining: Data mining tools are software that allows users to perform detailed
               mathematical and statistical calculations on detailed data warehouse data to detect trends,
               identify patterns and analyze data.
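
          The way an OLAP tool lets users choose which dimensions to view data by can be
          sketched in a few lines of Python. This is only an illustration, not how any particular
          OLAP product works: the fact rows, the dimension names (region, product, year), the
          measure name (amount) and the function roll_up are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, product, year) and one measure (amount).
SALES = [
    {"region": "North", "product": "Tea", "year": 2023, "amount": 100},
    {"region": "North", "product": "Coffee", "year": 2023, "amount": 150},
    {"region": "South", "product": "Tea", "year": 2023, "amount": 80},
    {"region": "South", "product": "Tea", "year": 2024, "amount": 120},
]


def roll_up(rows, dimensions, measure="amount"):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        # The key is the tuple of values for the dimensions the user selected.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same facts viewed by two different choices of dimensions.
by_region = roll_up(SALES, ["region"])
by_product_year = roll_up(SALES, ["product", "year"])
```

          Changing the list of dimensions passed to roll_up changes the view without changing
          the underlying data, which is the essential idea behind viewing a multi-dimensional
          structure along selected dimensions.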

          Metadata

          Metadata, or "data about data", is used not only to inform operators and users of the data
          warehouse about its status and the information held within it, but also as a means of
          integrating incoming data and as a tool to update and refine the underlying data warehouse
          model.

                 Example: Data warehouse metadata include table and column names, their detailed
          descriptions, their mapping to business-meaningful names, the most recent data load date, the
          business meaning of a data item and the number of users currently logged in.
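
          A metadata record of this kind might be represented as follows. This is a minimal
          sketch under assumptions: the class names, field names, table name fct_sales and column
          name cust_id are hypothetical and are not taken from any specific metadata repository.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ColumnMetadata:
    name: str            # physical column name
    business_name: str   # business-meaningful name shown to users
    description: str     # detailed description of the data item


@dataclass
class TableMetadata:
    name: str
    description: str
    last_load_date: date                 # most recent data load date
    columns: list = field(default_factory=list)

    def business_name_for(self, column_name):
        """Map a physical column name to its business-meaningful name."""
        for column in self.columns:
            if column.name == column_name:
                return column.business_name
        raise KeyError(column_name)


# Hypothetical entry describing one warehouse table.
sales_meta = TableMetadata(
    name="fct_sales",
    description="One row per invoice line",
    last_load_date=date(2024, 1, 31),
    columns=[
        ColumnMetadata("cust_id", "Customer Number", "Key of the purchasing customer"),
    ],
)
```

          A reporting tool could call sales_meta.business_name_for("cust_id") to label a report
          column with the business name instead of the physical column name.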

          Operations

          Data warehouse operations comprise the processes of loading data into the data warehouse,
          manipulating it there and extracting it again. Operations also cover user management, security,
          capacity management and related functions.
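
          The three core processes (loading, manipulating and extracting) can be illustrated with
          Python's built-in sqlite3 module. This is a toy sketch only: the in-memory database stands
          in for a warehouse, and the table name sales, its columns and the sample rows are
          assumptions made for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Load: insert rows that would normally arrive from a source system.
source_rows = [("north", 100.0), ("south", 80.0), ("north", 150.0)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", source_rows)

# Manipulate: standardise the region codes inside the warehouse.
conn.execute("UPDATE sales SET region = upper(region)")

# Extract: read aggregated data back out for reporting.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

conn.close()
```

          In a production warehouse each of these steps would typically be a scheduled, monitored
          job rather than a single statement, but the sequence is the same.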

          Optional Components

          In addition, the following components exist in some data warehouses:
          1.   Dependent Data Marts: A dependent data mart is a physical database (either on the
               same hardware as the data warehouse or on a separate hardware platform) that receives
               all its information from the data warehouse. The purpose of a data mart is to provide a
               subset of the data warehouse's data for a specific purpose or to a specific sub-group of
               the organization. A data mart is exactly like a data warehouse technically, but it serves a
               different business purpose: it either holds information for only part of a company (such as
               a division), or it holds a small selection of information for the entire company (to support



                                           Lovely Professional University