5.2.2 Modelling aspects
Data reconciliation for DataSources allows you to check the integrity of the loaded data by, for example, comparing the totals of a key figure in the DataStore object with the corresponding totals that the VirtualProvider accesses directly in the source system.
In addition, you can draw conclusions about possible extractor errors or errors in the processing of the data. This is possible if the data reconciliation DataSource uses a different extraction module than the productive DataSource.
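The comparison itself is a simple aggregate-and-join. The following minimal sketch illustrates the idea in plain SQL via Python's sqlite3; it is not SAP BW code, and the table and column names (dso_sales, src_sales, region, amount) are invented for illustration.

```python
# A sketch of the reconciliation idea: compare the totals of a key figure
# in the loaded data with totals computed directly against the source.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-ins for the DataStore object and the source-system table.
cur.execute("CREATE TABLE dso_sales (region TEXT, amount REAL)")
cur.execute("CREATE TABLE src_sales (region TEXT, amount REAL)")
cur.executemany("INSERT INTO dso_sales VALUES (?, ?)",
                [("NORTH", 100.0), ("SOUTH", 250.0)])
cur.executemany("INSERT INTO src_sales VALUES (?, ?)",
                [("NORTH", 100.0), ("SOUTH", 260.0)])

# Aggregate on both sides and join on the characteristic, so only the
# (small) totals are compared, not the individual records.
cur.execute("""
    SELECT d.region, d.total AS loaded, s.total AS source
    FROM (SELECT region, SUM(amount) AS total FROM dso_sales GROUP BY region) d
    JOIN (SELECT region, SUM(amount) AS total FROM src_sales GROUP BY region) s
      ON d.region = s.region
    WHERE d.total <> s.total
""")
for region, loaded, source in cur.fetchall():
    print(f"mismatch in {region}: loaded={loaded}, source={source}")
```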
We recommend that you keep the volume of transferred data as small as possible, because the data reconciliation DataSource accesses the data in the source system directly. This is best achieved with a data reconciliation DataSource delivered by BI Content, or with a generic DataSource based on function modules, because these allow you to implement aggregation logic. For mass data, you generally need to aggregate the data or make appropriate selections during extraction.
The data reconciliation DataSource must provide selection fields that allow it to extract the same set of data as the productive DataSource.
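Both points can be sketched together. The function below is not a real DataSource interface; it is an illustrative extractor, over the same invented src_sales table as above, that accepts selection parameters (so both DataSources can be restricted to an identical data set) and aggregates during extraction so that only totals cross the interface.

```python
# A sketch of an extractor that aggregates during extraction. The
# "regions" parameter plays the role of a selection field shared by the
# reconciliation DataSource and the productive DataSource.
import sqlite3

def extract_totals(conn, regions):
    """Return per-region totals for the selected regions only."""
    placeholders = ",".join("?" for _ in regions)
    cur = conn.execute(
        f"SELECT region, SUM(amount) FROM src_sales "
        f"WHERE region IN ({placeholders}) GROUP BY region",
        regions,
    )
    return dict(cur.fetchall())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO src_sales VALUES (?, ?)",
                 [("NORTH", 60.0), ("NORTH", 40.0), ("SOUTH", 260.0)])
print(extract_totals(conn, ["NORTH", "SOUTH"]))
# {'NORTH': 100.0, 'SOUTH': 260.0}
```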
Task: “The source systems for a data warehouse are typically transaction processing applications.” Discuss.
5.3 Data aggregation and customization
Data aggregation is any process in which information is gathered and expressed in summary form, for purposes such as statistical analysis. A common purpose of aggregation is to learn more about particular groups based on specific variables such as age, profession, or income. The information about such groups can then be used for Web site personalization, to choose content and advertising likely to appeal to an individual belonging to one or more groups for which data has been collected.
Example: A site that sells music CDs might advertise certain CDs based on the age of the user and the data aggregate for their age group. Online analytical processing (OLAP) is a simple type of data aggregation in which the marketer uses an online reporting mechanism to process the information.
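An OLAP-style summary is typically multidimensional. The toy cross-tab below, matching the CD-store example with invented records, aggregates sales by age band and genre and also maintains roll-up totals over each dimension.

```python
# A toy two-dimensional OLAP-style cross-tab (age band x CD genre).
from collections import defaultdict

sales = [
    ("20-29", "rock", 2), ("20-29", "jazz", 1),
    ("30-39", "rock", 1), ("30-39", "classical", 3),
]

cube = defaultdict(int)
for band, genre, qty in sales:
    cube[(band, genre)] += qty   # cell-level aggregation
    cube[(band, "*")] += qty     # roll-up over genre
    cube[("*", genre)] += qty    # roll-up over age band

print(cube[("20-29", "rock")])   # 2
print(cube[("*", "rock")])       # 3: total rock sales across age bands
```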
Data aggregation can also be user-based: personal data aggregation services offer the user a single point of collection for their personal information from other Web sites. The customer uses a single master personal identification number (PIN) to gain access to their various accounts (such as those for financial institutions, airlines, and book and music clubs). Performing this type of data aggregation is sometimes referred to as “screen scraping.”
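The name comes from reading values out of pages rendered for humans rather than for programs. The sketch below extracts an account balance from a fragment of HTML; a real service would fetch the page over HTTP after authenticating, but here the fetched page is simulated by a string so the example runs standalone, and the assumed page layout (a span with class "balance") is invented.

```python
# A minimal screen-scraping sketch using the standard-library HTML parser.
from html.parser import HTMLParser

FETCHED_PAGE = """
<html><body>
  <span class="balance">1,234.56</span>
</body></html>
"""

class BalanceScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_balance = False
        self.balance = None

    def handle_starttag(self, tag, attrs):
        # The class name "balance" is an assumption about the page layout;
        # screen scraping breaks whenever that layout changes.
        if tag == "span" and ("class", "balance") in attrs:
            self.in_balance = True

    def handle_data(self, data):
        if self.in_balance:
            self.balance = float(data.strip().replace(",", ""))
            self.in_balance = False

scraper = BalanceScraper()
scraper.feed(FETCHED_PAGE)
print(scraper.balance)  # 1234.56
```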