Page 167 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 8: Data Warehouse Refreshment




             The data cleansing process is also helping reduce marketing costs. Banco Popular will use
             the Trillium Software System® to enforce standardization and cleanse data. This will be
             the key to an accurate “householding” process, which is a way of identifying how many
             account holders live at the same address.
             By doing this, the bank can eliminate duplicate mailings to the same household, which
             makes the bank look much more efficient in its customers’ eyes, and saves at least $70,000
             in mailing expenses every month. Banco Popular’s home-grown address standardization
             system will soon be replaced by Trillium Software’s geocoding solution. This will save the
             cost of changing and recertifying the system each time the US Postal Service changes its
             standardization requirements.
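
             The householding idea described above can be sketched in a few lines of Python. This is
             only an illustration under stated assumptions, not how the Trillium Software System
             works: the abbreviation table, the function names and the sample accounts are all
             hypothetical, and a real cleansing tool applies far richer postal rules.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table (an assumption for this sketch only).
ABBREVIATIONS = {"street": "st", "avenue": "ave", "apartment": "apt", "road": "rd"}

def normalize_address(raw):
    """Lowercase the address, drop punctuation, and unify common word variants."""
    tokens = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def household(accounts):
    """Group (holder, address) pairs by their normalized address."""
    groups = defaultdict(list)
    for holder, address in accounts:
        groups[normalize_address(address)].append(holder)
    return dict(groups)

# Made-up sample data: the first two holders share one address written two ways.
accounts = [
    ("Ana Rivera", "12 Calle Luna, Apt. 3"),
    ("Luis Rivera", "12 calle luna apartment 3"),
    ("Marta Soto", "40 Ponce Avenue"),
]

households = household(accounts)
# One mailing per household instead of one per account holder.
mailings_saved = len(accounts) - len(households)
```

             Grouping only works after standardization: without the normalization step the two
             spellings of the first address would count as separate households.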
             DB2 can easily handle customer information systems containing millions of records in
             multiple languages that are initially cleansed with the Trillium Software System. Banco
             Popular is expanding, and its customers may be represented within complex financial
             records on the database in either English or Spanish. Trillium Software scales in step with
             the growing DB2 database and works in numerous languages to provide a global solution
             for this multinational bank.


          8.4 Summary

          •  DWH refreshment so far has been investigated in the research community mainly in
             relation to techniques for maintaining materialized views.

          •  In these approaches, the DWH is considered as a set of materialized views defined over
             operational data. Thus, the topic of warehouse refreshment is defined as a problem of
             updating a set of views (the DWH) as a result of modifications of base relations (residing in
             operational systems). Several issues have been investigated in this context.

          •  The extraction method you should choose depends heavily on the source system and
             on the business needs in the target data warehouse environment.
          •  Very often, it is not possible to add logic to the source systems to support incremental
             extraction of data, because of the performance impact or the increased workload on
             these systems.
          •  Sometimes the customer is not even allowed to add anything to an out-of-the-box
             application system.
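
          The view-maintenance formulation in the summary can be illustrated with a small sketch.
          It assumes a single materialized view that stores a total and a row count per branch over
          a base relation of (branch, amount) rows; the relation, the function names and the sample
          values are hypothetical. The point is that the view is refreshed from the inserted and
          deleted base rows alone, without rescanning the base relation.

```python
def recompute(base_rows):
    """Build the view from scratch: branch -> (total amount, row count)."""
    view = {}
    for branch, amount in base_rows:
        total, count = view.get(branch, (0, 0))
        view[branch] = (total + amount, count + 1)
    return view

def refresh(view, inserted, deleted):
    """Incrementally update the view in place from changes to the base relation."""
    for branch, amount in inserted:
        total, count = view.get(branch, (0, 0))
        view[branch] = (total + amount, count + 1)
    for branch, amount in deleted:
        total, count = view[branch]
        if count == 1:
            # Last row of the group is gone, so the group leaves the view.
            del view[branch]
        else:
            view[branch] = (total - amount, count - 1)
    return view

# Made-up base relation and one batch of modifications.
base = [("san_juan", 100), ("san_juan", 250), ("ponce", 75)]
view = recompute(base)

inserted = [("ponce", 25), ("mayaguez", 40)]
deleted = [("san_juan", 100)]
refresh(view, inserted, deleted)

# Reference result: apply the same changes to the base rows and rebuild.
new_base = [row for row in base if row not in deleted] + inserted
expected = recompute(new_base)
```

          Storing the row count next to the total is a design choice of this sketch: it lets a
          deletion remove a group whose last row has disappeared, which a running total alone
          cannot detect.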

          8.5 Keywords


          Corporate Data Store: The corporate data store can be complemented by an Operational Data
          Store (ODS) which groups the base data collected and integrated from the sources.
          Data Cleaning: Data cleaning can be applied to remove noise and correct inconsistencies in the
          data.
          Incremental Data Extraction: How incremental data extraction can be implemented depends
          on the characteristics of the data sources and also on the desired functionality of the data
          warehouse system.
          The Design Phase: The design phase consists of the definition of user views, auxiliary views,
          source extractors, data cleaners, and data integrators.
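
          As an illustration of the Incremental Data Extraction entry, one possible approach for a
          source system that cannot be modified is to compare two successive snapshots outside
          the source. The sketch below assumes each snapshot is available as a dictionary keyed by
          primary key; the function name and the sample rows are hypothetical.

```python
def snapshot_delta(previous, current):
    """Compare two snapshots keyed by primary key.

    Returns three dictionaries: rows inserted, rows updated (new values),
    and rows deleted since the previous snapshot.
    """
    inserted = {key: row for key, row in current.items() if key not in previous}
    deleted = {key: row for key, row in previous.items() if key not in current}
    updated = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return inserted, updated, deleted

# Made-up snapshots of a customer table: key -> (name, city).
yesterday = {
    1: ("Ana Rivera", "San Juan"),
    2: ("Luis Soto", "Ponce"),
    3: ("Marta Diaz", "Caguas"),
}
today = {
    1: ("Ana Rivera", "San Juan"),
    2: ("Luis Soto", "Mayaguez"),
    4: ("Eva Cruz", "Arecibo"),
}

inserted, updated, deleted = snapshot_delta(yesterday, today)
```

          Only the three delta sets need to be passed on to the warehouse. The cost is that each
          comparison still reads both full snapshots, so this trades load on the source system for
          work in the extraction layer.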










                                           Lovely Professional University