Page 65 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 65
Unit 4: Data Mining Classification
notes
Figure 4.1: The Data Classification Process
(a)
(b)
4.1 What is Classification and Prediction?
4.1.1 Classification
Classification is a data mining technique used to predict group membership for data instances.
Example: You may wish to use classification to predict whether the weather on a
particular day will be “sunny”, “rainy” or “cloudy”. Popular classification techniques include
decision trees and neural networks.
In classification, there is a target categorical variable, such as income bracket, which, for example,
could be partitioned into three classes or categories: high income, middle income, and low income.
The data mining model examines a large set of records, each record containing information on
the target variable as well as a set of input or predictor variables. For example, consider the
excerpt from a data set shown in Table 4.1.
LoveLy professionaL university 59