
Data Warehousing and Data Mining




                                   3.   Enabled a better understanding of customer behavior and customer profiles
                                   4.   Helped the customer increase its market share and understand the competition better


                                   5.9 Summary


                                   •    In the process of extracting data from one source, transforming it, and loading it
                                        into the next layer, the nature of the data can change considerably.
                                   •    Some information may also be lost while the data is transformed. A reconciliation
                                        process helps to identify such loss of information.
                                   •    One of the major reasons for information loss is a loading failure or an error during loading.
                                   •    Data reconciliation is often confused with data quality testing. Even worse, the data
                                        reconciliation process is sometimes used to investigate and pinpoint data issues.
                                   •    While data reconciliation may be a part of data quality assurance, the two are not
                                        necessarily the same.
                                   •    The scope of data reconciliation should be limited to identifying whether there is any
                                        issue in the data at all.
                                   •    The scope should not be extended to automating the investigation of the data and
                                        pinpointing the issues.
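
The limited scope described in the summary can be illustrated with a minimal sketch. It assumes, purely for illustration, that each layer is available as a list of dictionaries and that comparing row counts and one control total is enough to detect a loss; the names `reconcile`, `ReconciliationResult`, and the sample rows are hypothetical and do not come from the text. The check reports only whether a discrepancy exists and does not try to locate the faulty rows.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationResult:
    source_rows: int
    target_rows: int
    source_total: float
    target_total: float

    @property
    def has_issue(self) -> bool:
        # An issue exists if rows went missing or the control total drifted.
        return (self.source_rows != self.target_rows
                or abs(self.source_total - self.target_total) > 1e-9)


def reconcile(source, target, measure):
    """Compare the row count and a control total between two layers."""
    return ReconciliationResult(
        source_rows=len(source),
        target_rows=len(target),
        source_total=sum(row[measure] for row in source),
        target_total=sum(row[measure] for row in target),
    )


# Hypothetical data: the row with id 3 failed to load into the target layer.
source_rows = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": 250.0},
    {"id": 3, "amount": 75.0},
]
target_rows = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": 250.0},
]

result = reconcile(source_rows, target_rows, "amount")
print(result.has_issue)  # True: a discrepancy exists between the layers
```

Finding out which row was lost, and why, would be a separate data quality investigation rather than part of this check.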

                                   5.10 Keywords


                                   Data Aggregation: Data aggregation is any process in which information is gathered and
                                   expressed in a summary form, for purposes such as statistical analysis.
                                   Data Extraction: Extraction is the operation of extracting data from a source system for further
                                   use in a data warehouse environment.
                                   Data Quality: Data quality has been defined as the fraction of performance over expectancy, or
                                   as the loss imparted to society from the time a product is shipped.
                                   Update Propagation: Update propagation is the distribution of data from one or more source
                                   data warehouses to one or more local access databases, according to propagation rules.
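
As a small illustration of the Data Aggregation entry above, the following sketch gathers detail records and expresses them in summary form. The sales records, the `region` and `amount` fields, and the `aggregate_by` helper are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical detail-level sales records.
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 60.0},
    {"region": "South", "amount": 40.0},
]


def aggregate_by(records, key, measure):
    """Group records by `key` and summarise `measure` for each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[measure])
    return {
        group: {"count": len(values), "total": sum(values), "average": mean(values)}
        for group, values in groups.items()
    }


summary = aggregate_by(sales, "region", "amount")
print(summary["North"])  # {'count': 2, 'total': 180.0, 'average': 90.0}
```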

                                   5.11 Self Assessment

                                   Choose the appropriate answers:

                                   1.   SQL stands for:
                                       (a)   Structured Query Language
                                       (b)   Structured Query League
                                       (c)   Systematic Query Language
                                       (d)   Structured Queue Language

                                   2.   PIN stands for:
                                       (a)   Permanent Identification Number
                                       (b)   Permanent Index Number
                                       (c)   Personal Identification Number
                                       (d)   Personal Index Number


          102                              Lovely Professional University