Method
1. Create a node N;
2. If samples are all of the same class C, then
3. Return N as a leaf node labeled with the class C;
4. If attribute-list is empty then
5. Return N as a leaf node labeled with the most common class in samples; // majority voting
6. Select test-attribute, the attribute among attribute-list with the highest information gain;
7. Label node N with test-attribute;
8. For each known value a_i of test-attribute // partition the samples
9. Grow a branch from node N for the condition test-attribute = a_i;
10. Let s_i be the set of samples in samples for which test-attribute = a_i; // a partition
11. If s_i is empty then
12. Attach a leaf labeled with the most common class in samples;
13. Else attach the node returned by Generate decision tree(s_i, attribute-list - test-attribute).
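The following Python sketch is one way to make the pseudocode concrete. The function names (generate_decision_tree, information_gain, majority_class), the assumption that each sample is a dictionary of categorical attribute values, and the dictionary representation of the tree are illustrative choices, not part of the original algorithm.

import math
from collections import Counter

def entropy(samples, class_attr):
    """Expected information (entropy) of the class labels in samples."""
    counts = Counter(s[class_attr] for s in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(samples, attr, class_attr):
    """Reduction in class entropy obtained by partitioning samples on attr."""
    total = len(samples)
    remainder = 0.0
    for value in set(s[attr] for s in samples):
        subset = [s for s in samples if s[attr] == value]
        remainder += (len(subset) / total) * entropy(subset, class_attr)
    return entropy(samples, class_attr) - remainder

def majority_class(samples, class_attr):
    """Most common class label in samples (majority voting, steps 5 and 12)."""
    return Counter(s[class_attr] for s in samples).most_common(1)[0][0]

def generate_decision_tree(samples, attribute_list, class_attr):
    """Recursive, ID3-style construction following the method above.
    A leaf is a class label (string); an internal node is a dict
    {'attribute': test_attr, 'branches': {value: subtree, ...}}."""
    classes = set(s[class_attr] for s in samples)
    if len(classes) == 1:                # steps 2-3: pure node becomes a leaf
        return classes.pop()
    if not attribute_list:               # steps 4-5: no attributes left
        return majority_class(samples, class_attr)
    # steps 6-7: choose the attribute with the highest information gain
    test_attr = max(attribute_list,
                    key=lambda a: information_gain(samples, a, class_attr))
    node = {'attribute': test_attr, 'branches': {}}
    # steps 8-13: one branch per known value of the test attribute
    for value in set(s[test_attr] for s in samples):
        s_i = [s for s in samples if s[test_attr] == value]
        if not s_i:                      # steps 11-12 (cannot occur here, since
            subtree = majority_class(samples, class_attr)  # values come from samples)
        else:                            # step 13: recurse without test_attr
            remaining = [a for a in attribute_list if a != test_attr]
            subtree = generate_decision_tree(s_i, remaining, class_attr)
        node['branches'][value] = subtree
    return node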
4.7.2 Decision tree induction
The automatic generation of decision rules from examples is known as rule induction or automatic
rule induction.
Generating decision rules in the implicit form of a decision tree is also often called rule induction, but the terms tree induction or decision tree induction are sometimes preferred.
The basic algorithm for decision tree induction is a greedy algorithm, which constructs decision
trees in a top-down recursive divide-and-conquer manner. The basic algorithm for learning
decision trees is a version of ID3, a well-known decision tree induction algorithm.
The basic strategy is as follows:
1. The tree starts as a single node representing the training samples (step 1).
2. If the samples are all of the same class, then the node becomes a leaf and is labeled with that
class (steps 2 and 3).
3. Otherwise, the algorithm uses an entropy-based measure known as information gain as a
heuristic for selecting the attribute that will best separate the samples into individual classes
(step 6). This attribute becomes the “test” or “decision” attribute at the node (step 7). In this
version of the algorithm, all attributes are categorical, i.e., discrete-valued. Continuous-
valued attributes must be discretized.
4. A branch is created for each known value of the test attribute, and the samples are
partitioned accordingly (steps 8-10).
5. The algorithm uses the same process recursively to form a decision tree for the samples at
each partition. Once an attribute has occurred at a node, it need not be considered in any of the node's descendants (step 13); a small worked example follows this list.
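To see the strategy of steps 1-5 in action, the sketch above can be exercised on a toy training set. The attribute names and values here are invented purely for illustration; any categorical data in the same dictionary format would behave the same way.

# Illustrative data only: a tiny hand-made training set in which 'outlook'
# alone separates the classes, so the recursion stops after a single split.
training_samples = [
    {'outlook': 'sunny',    'windy': 'false', 'play': 'no'},
    {'outlook': 'sunny',    'windy': 'true',  'play': 'no'},
    {'outlook': 'overcast', 'windy': 'false', 'play': 'yes'},
    {'outlook': 'overcast', 'windy': 'true',  'play': 'yes'},
]

tree = generate_decision_tree(training_samples, ['outlook', 'windy'], 'play')
print(tree)
# Expected shape: {'attribute': 'outlook',
#                  'branches': {'sunny': 'no', 'overcast': 'yes'}}

Because 'outlook' perfectly separates the two classes, its information gain is higher than that of 'windy', so it is selected at the root and both branches immediately become leaves.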