Page 243 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 12: Metadata and Warehouse Quality




          support” (Inmon, 1996). In the “Data Warehouse Toolkit”, Ralph Kimball gives a more concise
          definition: “a copy of transaction data specifically structured for query and analysis” (Kimball,
          1998). Both definitions stress the data warehouse’s analysis focus, and highlight the historical
          nature of the data found in a data warehouse.
                                  Figure 12.8: Data Warehousing Structure


































          Stages of Data Warehousing Susceptible to Data Quality Problems

          The purpose here is to formulate a descriptive taxonomy of the data quality issues that can arise
          at each stage of data warehousing. The stages are:

          1.   Data Source
          2.   Data Integration and Data Profiling
          3.   Data Staging and ETL
          4.   Database Schema (Modeling)

          Data quality can be compromised by how data is received, entered, integrated, maintained,
          processed (extracted, transformed and cleansed) and loaded. Data passes through numerous
          processes on its way into the data environment, and most of them affect its quality to some
          extent. Every phase of data warehousing therefore bears on the quality of the data in the
          warehouse. Despite all cleansing efforts, a certain percentage of dirty data still remains. This
          residual dirty data should be reported, together with the reasons why cleansing failed for it.
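
          The reporting of residual dirty data described above can be sketched in a few lines of
          Python. This is only an illustrative sketch, not a prescribed method: the record fields
          (customer_id, amount), the two validation rules and the function names cleanse and
          residual_report are all hypothetical and stand in for whatever rules a real ETL process
          applies.

```python
from dataclasses import dataclass


@dataclass
class Rejection:
    """A record that failed cleansing, plus the reasons it failed."""
    record: dict
    reasons: list


# Hypothetical validation rules: each returns a reason string on failure, else None.
def check_customer_id(rec):
    if not rec.get("customer_id"):
        return "missing customer_id"
    return None


def check_amount(rec):
    try:
        amount = float(rec.get("amount"))
    except (TypeError, ValueError):
        return "amount is not numeric"
    if amount < 0:
        return "amount is negative"
    return None


RULES = [check_customer_id, check_amount]


def cleanse(records, rules=RULES):
    """Split records into clean ones and rejected ones with failure reasons."""
    clean, rejected = [], []
    for rec in records:
        reasons = [r for r in (rule(rec) for rule in rules) if r is not None]
        if reasons:
            rejected.append(Rejection(rec, reasons))
        else:
            clean.append(rec)
    return clean, rejected


def residual_report(clean, rejected):
    """Summarise the residual dirty data: share rejected and why."""
    total = len(clean) + len(rejected)
    pct = 100.0 * len(rejected) / total if total else 0.0
    lines = [f"{len(rejected)} of {total} records rejected ({pct:.1f}%)"]
    for rej in rejected:
        lines.append(f"{rej.record!r}: {'; '.join(rej.reasons)}")
    return "\n".join(lines)
```

          Keeping the reasons alongside each rejected record, rather than silently dropping it,
          is what allows the residual percentage and its causes to be reported after loading.
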
          Data quality problems can occur in many different ways. The most common include:
          1.   Poor data handling procedures and processes.
          2.   Failure to adhere to data entry and maintenance procedures.

          3.   Errors in the migration process from one system to another.
          4.   External and third-party data that may not fit your company's data standards or may
               otherwise be of questionable quality.


