
Data Warehousing and Data Mining




                                   Although this data warehouse deals "only" with financial figures, there is a host of
                                   semantic coherency questions to be resolved between the different accounting definitions required
                                   by tax laws, stock exchanges, different financial products, and the like. At the same time, there
                                   are massive physical data integration problems to be solved: tens of thousands of
                                   multi-dimensional data cubes must be recalculated daily to give top management
                                   close to zero-latency information. In light of such problems, many architectures discussed in the
                                   literature appear somewhat naive.
                                   The key to solving these enormous problems in a flexible and evolvable manner is enriched
                                   metadata management, used by different kinds of interacting software components. In the
                                   following section, we present our approach to organizing this.

                                   14.2 Interaction between Quality Factors and DW Tasks

                                   Starting from a definition of the basic DW architecture and the relevant data quality issues,
                                   the first project goal is to define the range of design and operational method alternatives for
                                   each of the main architecture components and quality factors. Since a combination of
                                   enabling technologies is usually required, innovations are envisioned both at the design level
                                   (e.g., rich metadata representation and reasoning facilities) and at the operational level (e.g.,
                                   when DW contents are viewed as views over the underlying information sources, refreshment
                                   techniques and optimal handling of views with aggregate functions become important). In a
                                   second step, formal models of the DW architecture and services will be developed, together with
                                   associated tools for consistency checking in the richer model, reuse by subsumption, view
                                   materialization strategies, and other components of the data warehousing software. These models
                                   and tools will make the knowledge about operational alternatives and their configuration
                                   available to the data warehouse designer, allowing the data warehouse structure and quality of
                                   service to be adapted dynamically to the ever-changing information sources and analysis patterns.
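
                                   To make the idea of refreshing a view with aggregate functions concrete, the following is a
                                   minimal sketch, not the approach of the project described here. It assumes a DW view defined
                                   as a per-key SUM and COUNT over a source relation of (key, amount) rows; the class name
                                   MaterializedSumView and the function recompute are hypothetical names chosen for
                                   illustration. The view is refreshed incrementally from the inserted and deleted source rows
                                   instead of being recomputed from scratch.

```python
from collections import defaultdict


class MaterializedSumView:
    """Hypothetical materialized view over (key, amount) source rows.

    Conceptually: SELECT key, SUM(amount), COUNT(*) FROM source GROUP BY key.
    """

    def __init__(self):
        self.totals = defaultdict(int)  # key -> SUM(amount)
        self.counts = defaultdict(int)  # key -> COUNT(*)

    def apply_delta(self, inserted=(), deleted=()):
        """Incremental refresh: fold source changes into the stored aggregates."""
        for key, amount in inserted:
            self.totals[key] += amount
            self.counts[key] += 1
        for key, amount in deleted:
            self.totals[key] -= amount
            self.counts[key] -= 1
            if self.counts[key] == 0:
                # The group no longer has any source rows, so drop it.
                del self.totals[key]
                del self.counts[key]

    def rows(self):
        """Return the current view contents as {key: (sum, count)}."""
        return {key: (self.totals[key], self.counts[key]) for key in self.totals}


def recompute(source_rows):
    """Full refresh: rebuild the view from the complete source relation."""
    view = MaterializedSumView()
    view.apply_delta(inserted=source_rows)
    return view.rows()


# Initial load, followed by one incremental refresh.
view = MaterializedSumView()
view.apply_delta(inserted=[("EU", 100), ("US", 250), ("EU", 50)])
view.apply_delta(inserted=[("US", 30)], deleted=[("EU", 50)])

# The incrementally maintained view matches a full recomputation.
current_source = [("EU", 100), ("US", 250), ("US", 30)]
assert view.rows() == recompute(current_source)
```

                                   SUM and COUNT can be maintained from the changes alone, which is why they are used in this
                                   sketch. Aggregates such as MIN and MAX cannot in general be maintained under deletions
                                   without consulting the source again, which is one reason the choice of refreshment technique
                                   depends on the aggregate functions involved.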
                                   The increased accessibility of information over wide-area networks does not solve the problem of
                                   having the right information in the right place at the right time and at the right cost.

                                   Data warehousing has become an important strategy for integrating heterogeneous information
                                   sources in organizations and for enabling on-line analytic processing. Its development is a
                                   consequence of the observation by W. Inmon and E. F. Codd in the early 1990s that
                                   operational-level on-line transaction processing (OLTP) and decision support applications
                                   (on-line analytic processing, or OLAP) cannot coexist efficiently in the same database
                                   environment, mostly because of their very different transaction characteristics.
                                   A DW caches selected data of interest to a customer group, so that access becomes faster, cheaper,
                                   and more effective. As the long-term buffer between OLTP and OLAP (Figure 14.3), DWs face two
                                   essential questions: how to reconcile the stream of incoming data from multiple heterogeneous
                                   legacy sources, and how to customize derived data storage to specific OLAP applications. The
                                   trade-offs driving the design decisions on these two issues change continuously with
                                   business needs; design support and change management are therefore of the greatest importance
                                   if DW projects are not to run into dead ends. This is a recognized problem in industry, and one
                                   that is not solvable without improved formal foundations.

















