Page 168 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Data Warehousing and Data Mining




                                   8.6 Self Assessment

                                   Fill in the blanks:

                                   1.   Recent inquiries show that ..................... warehouses are becoming commonplace.
                                   2.   The refreshment of an ..................... many transactions that need to access and update a few
                                       records.
                                   3.   ..................... are heterogeneous and can include conventional database systems and
                                       nontraditional sources like flat files, XML and HTML documents.
                                   4.   ..................... is a standardized API developed by the X/Open standardization committee.
                                   5.   Replace all missing attribute values by the same constant, such as a label like .....................
                                   6.   ..................... is a random error or variance in a measured variable.

                                   7.   ..................... may be detected by clustering, where similar values are organized into groups
                                       or “clusters”.
                                   8.   Some data inconsistencies may be corrected manually using ..................... references.

                                   9.   The ..................... computes incrementally the hierarchy of aggregated views using these
                                       changes.
                                   10.  Power for loading is now measured in ..................... per hour and several companies are
                                       moving to parallel architectures when possible to increase their processing power for
                                       loading and refreshment.

                                   8.7 Review Questions


                                   1.   What data would you call inconsistent? Explain with a suitable example.
                                   2.   Describe the data refreshment process in detail.
                                   3.   Explain the loading phase of data refreshment.
                                   4.   What are the major difficulties generally faced in data warehouse refreshment?
                                   5.   Describe incremental data extraction.

                                   6.   “Dirty data can cause confusion for the mining procedure.” Explain.
                                   7.   “The refreshment of a data warehouse is an important process which determines the
                                       effective usability of the data collected and aggregated from the sources.” Discuss.
                                   8.   “The period for refreshment is considered to be larger for global data warehouses.” Why?

                                   9.   “Outliers may be identified through a combination of computer and human inspection.”
                                       Explain.
                                   10.  “Data cleaning can be applied to remove noise and correct inconsistencies in the data.”
                                       Discuss.

                                   Answers: Self Assessment

                                   1.   100 GB                           2.   ODS involves
                                   3.   Data sources                     4.   Call Level Interface (CLI)
                                   5.   Unknown                          6.   Noise





