Page 60 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Data Warehousing and Data Mining




                                     3.   Scope for pilot: KL Hub (for actual data capture, UAT & roll-out), but the framework
                                          must incorporate the APJCC perspective.
                                     4.   Define data definitions, DWH structure, data capture processes, business logic &
                                          system rules, applications & tools for the Datamart.

                                     5.   Design & implement

                                   3.8 Summary

                                   •  In this unit, you learnt about data mining techniques. Data mining is an analytic process
                                       designed to explore data (usually large amounts of data, typically business or market
                                       related) in search of consistent patterns and/or systematic relationships between variables,
                                       and then to validate the findings by applying the detected patterns to new subsets of
                                       data.
                                   •  The ultimate goal of data mining is prediction. Predictive data mining is the most
                                       common type of data mining and the one with the most direct business applications.

                                   •  The process of data mining consists of three stages: (1) the initial exploration, (2) model
                                       building or pattern identification with validation/verification, and (3) deployment (i.e., the
                                       application of the model to new data in order to generate predictions).
                                   •  In this unit, you also learnt about a statistical perspective on data mining, similarity measures,
                                       decision trees, and more.
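The three stages named in the summary can be sketched in a few lines of Python. This is only an illustrative sketch, not a method prescribed by this unit: the toy records (a spend value paired with a 0/1 label), the function names, and the use of a single threshold rule as the "model" are all assumptions made for the example. The threshold rule is the simplest possible decision tree, with one decision rule that splits the records into two smaller sets.

```python
from statistics import mean

# Hypothetical toy data, invented for illustration: (monthly_spend, churned) pairs.
train = [(12, 1), (15, 1), (18, 1), (22, 1), (35, 0), (40, 0), (44, 0), (52, 0)]
holdout = [(14, 1), (20, 1), (38, 0), (47, 0)]


def explore(records):
    """Stage 1: initial exploration, here just the average value per label."""
    by_label = {}
    for value, label in records:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def rule(threshold, value):
    """The model: one decision rule, predict 1 if value < threshold, else 0."""
    return 1 if value < threshold else 0


def fit_threshold(records):
    """Stage 2a: model building. Choose the cut-off with the fewest training errors."""
    values = sorted(v for v, _ in records)
    # Candidate cut-offs are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum(rule(t, v) != y for v, y in records)

    return min(candidates, key=errors)


def accuracy(threshold, records):
    """Stage 2b: validation on records that were not used for fitting."""
    return sum(rule(threshold, v) == y for v, y in records) / len(records)


summary = explore(train)                         # stage 1
threshold = fit_threshold(train)                 # stage 2: build
holdout_accuracy = accuracy(threshold, holdout)  # stage 2: validate
new_prediction = rule(threshold, 16)             # stage 3: deploy on a new record
```

Validating on `holdout` rather than on `train` mirrors the summary's point that detected patterns are checked against new subsets of data before the model is used for prediction.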

                                   3.9 Keywords


                                   Decision Tree: A decision tree is a structure that can be used to divide up a large collection
                                   of records into successively smaller sets of records by applying a sequence of simple decision
                                   rules.

                                   Dice: The Dice coefficient is a generalization of the harmonic mean of the precision and recall
                                   measures.

                                   Genetic Algorithms: Genetic algorithms are mathematical procedures utilizing the process of
                                   genetic inheritance.

                                   Similarity Measures: Similarity measures provide the framework on which many data mining
                                   decisions are based.
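As a concrete instance of a similarity measure, the Dice coefficient of two sets A and B can be written as 2|A ∩ B| / (|A| + |B|). The sketch below, with hypothetical document identifiers and helper names chosen only for illustration, computes it for a retrieved set and a relevant set and checks that, in this set-based case, it coincides with the harmonic mean of precision and recall.

```python
def dice(a, b):
    """Dice coefficient of two sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption for this sketch: two empty sets count as identical
    return 2 * len(a & b) / (len(a) + len(b))


def precision_recall(retrieved, relevant):
    """Precision and recall; this sketch assumes both sets are non-empty."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)


def harmonic_mean(p, r):
    """Harmonic mean of two non-negative numbers, defined as 0 when both are 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# Hypothetical document identifiers, invented for illustration.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d3", "d5"}

score = dice(retrieved, relevant)
p, r = precision_recall(retrieved, relevant)
f1 = harmonic_mean(p, r)
```

Here two of the four retrieved documents are relevant, so precision is 2/4, recall is 2/3, and both `score` and `f1` come out to 4/7.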
                                   3.10 Self Assessment


                                   Fill in the blanks:
                                   1.   ....................... is the science of learning from data.
                                   2.   ....................... are known to be crude information and not knowledge by themselves.
                                   3.   ....................... provide the framework on which many data mining decisions are based.

                                   4.   The goal of ....................... systems is to meet user needs.
                                   5.   The ....................... is sometimes calculated using the max operator in place of the min.
                                   6.   ....................... are very sophisticated modeling techniques capable of modeling extremely
                                       complex functions.

                                   7.   ....................... are mathematical procedures utilizing the process of genetic inheritance.




