5.2.2 Modelling aspects

Data reconciliation for DataSources allows you to check the integrity of the loaded data by, for example, comparing the totals of a key figure in the DataStore object with the corresponding totals that the VirtualProvider accesses directly in the source system.

In addition, you can identify potential errors in the data processing that are caused by the extractor or by its interpretation of the data. This function is available if the data reconciliation DataSource uses a different extraction module from the productive DataSource.

We recommend that you keep the volume of transferred data as small as possible, because the data reconciliation DataSource accesses the data in the source system directly. This is best achieved with a data reconciliation DataSource delivered by Business Intelligence Content, or with a generic DataSource based on function modules, because both allow you to implement aggregation logic. For mass data, you generally need to aggregate the data or make appropriate selections during extraction.

The data reconciliation DataSource has to provide selection fields that allow it to extract the same set of data as the productive DataSource.



Task     “The source systems for a data warehouse are typically transaction processing applications.” Discuss.


                                   5.3 Data aggregation and customization

Data aggregation is any process in which information is gathered and expressed in a summary form, for purposes such as statistical analysis. A common aggregation purpose is to get more information about particular groups based on specific variables such as age, profession, or income. The information about such groups can then be used for Web site personalization, to choose content and advertising likely to appeal to an individual belonging to one or more groups for which data has been collected.

Example: A site that sells music CDs might advertise certain CDs based on the age of the user and the data aggregate for their age group. Online analytical processing (OLAP) is a simple type of data aggregation in which the marketer uses an online reporting mechanism to process the information.
Data aggregation can also be user-based: personal data aggregation services offer the user a single point of collection for their personal information from other Web sites. The customer uses a single master personal identification number (PIN) to gain access to their various accounts (such as those for financial institutions, airlines, book and music clubs, and so on). Performing this type of data aggregation is sometimes referred to as “screen scraping.”
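To make the mechanism concrete, here is an illustrative Python sketch of such a service under simplifying assumptions: one master PIN unlocks stored per-account credentials, and a hypothetical fetch_balance function stands in for the actual per-site scraping logic. None of these names come from a real product.

```python
# Illustrative sketch of user-based data aggregation ("screen scraping").
# All names and data are invented; fetch_balance is a hypothetical
# placeholder for real per-site login-and-scrape logic.

MASTER_PIN = "0000"  # the single PIN the user presents to the service

# Per-site credentials the service keeps on the user's behalf.
accounts = {
    "bank": {"user": "alice", "password": "s3cret"},
    "airline": {"user": "alice", "password": "m1les"},
}

# Stand-in for the remote account pages the service would scrape.
remote_balances = {"bank": 1500.00, "airline": 32000.0}


def fetch_balance(site: str, credentials: dict) -> float:
    """Hypothetical: log in to the site with the stored credentials and
    scrape the current balance from the account page."""
    return remote_balances[site]


def aggregate(pin: str) -> dict:
    """Return one consolidated view of all accounts once the single
    master PIN has been verified."""
    if pin != MASTER_PIN:
        raise PermissionError("master PIN rejected")
    return {site: fetch_balance(site, creds) for site, creds in accounts.items()}


print(aggregate("0000"))  # {'bank': 1500.0, 'airline': 32000.0}
```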

















