Page 65 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 65

Unit 4: Data Mining Classification




                                                                                                notes
                                 Figure 4.1: The Data Classification Process























                                               (a)
















                                               (b)


          4.1 What is Classification and Prediction?


          4.1.1 Classification

          Classification is a data mining technique used to predict group membership for data instances.


                 Example: You  may  wish  to  use  classification  to  predict  whether  the  weather  on  a
          particular day will be “sunny”, “rainy” or “cloudy”. Popular classification techniques include
          decision trees and neural networks.
          In classification, there is a target categorical variable, such as income bracket, which, for example,
          could be partitioned into three classes or categories: high income, middle income, and low income.
          The data mining model examines a large set of records, each record containing information on
          the target variable as well as a set of input or predictor variables. For example, consider the
          excerpt from a data set shown in Table 4.1.











                                           LoveLy professionaL university                                    59
   60   61   62   63   64   65   66   67   68   69   70