Page 80 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Data Warehousing and Data Mining




                                   6.   Path to terminal node 12 - the company prepare the bid, make the short-list but their bid of
                                       £170K is unsuccessful
                                       Total cost = 10 + 5; Total profit = –15
                                   7.   Path to terminal node 13 - the company prepare the bid, make the short-list and their bid
                                       of £190K is accepted
                                       Total cost = 10 + 5 + 127; Total revenue = 190; Total profit = 48
                                   8.   Path to terminal node 14 - the company prepare the bid, make the short-list but their bid of
                                       £190K is unsuccessful
                                       Total cost = 10 + 5; Total profit = –15
                                   9.   Path to terminal node 15 - the company prepare the bid and make the short-list and then
                                       decide to abandon bidding (an implicit option available to the company)
                                       Total cost = 10 + 5; Total profit = –15

                                   Hence we can arrive at the table below indicating for each branch the total profit involved in that
                                   branch from the initial node to the terminal node.

                                                           Terminal node   Total profit (£K)
                                                                 7              0
                                                                 8             –10
                                                                 9             13
                                                                 10            –15
                                                                 11            28
                                                                 12            –15
                                                                 13            48
                                                                 14            –15
                                                                 15            –15
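As a quick arithmetic check, the profits for the paths to terminal nodes 12 to 15 listed above can be recomputed with a short Python sketch. The constant names are illustrative assumptions: the text gives only the figures 10, 5 and 127 (in £K), and reading them as bid-preparation, short-listing and contract costs is inferred from the path descriptions.

```python
# All figures are in £K and come from the path descriptions above.
# The constant names are assumptions made for readability.
BID_PREPARATION_COST = 10
SHORT_LIST_COST = 5
CONTRACT_COST = 127  # incurred only when a bid is accepted

# Revenue received on each path; None means no contract is won
# (bid unsuccessful, or bidding abandoned after the short-list).
path_revenue = {
    12: None,  # £170K bid unsuccessful
    13: 190,   # £190K bid accepted
    14: None,  # £190K bid unsuccessful
    15: None,  # bidding abandoned
}


def path_profit(revenue):
    """Profit for a path that reaches the short-list stage."""
    cost = BID_PREPARATION_COST + SHORT_LIST_COST
    if revenue is None:
        return -cost
    return revenue - (cost + CONTRACT_COST)


profits = {node: path_profit(rev) for node, rev in path_revenue.items()}

for node, profit in sorted(profits.items()):
    print(f"terminal node {node}: total profit = {profit}")
```

Only the paths whose costs and revenues are spelled out on this page are included; the remaining rows of the table would follow the same pattern.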
                                   We can now carry out the second step of the decision tree solution procedure where we work
                                   from the right-hand side of the diagram back to the left-hand side.



                                   Task: “Different distance functions have different characteristics, which fit various
                                   types of data.” Explain.


                                   4.7.4 Extracting Classification Rules from Decision Trees

                                   Even though the pruned trees are more compact than the originals, they can still be very
                                   complex. Large decision trees are difficult to understand because each node has a specific context
                                   established by the outcomes of tests at antecedent nodes. To make a decision-tree model more
                                   readable, a path to each leaf can be transformed into an IF-THEN production rule. The IF part
                                   consists of all tests on a path, and the THEN part is a final classification. Rules in this form are
                                   called decision rules, and a collection of decision rules for all leaf nodes would classify samples
                                   exactly as the tree does. As a consequence of their tree origin, the IF parts of the rules would be
                                   mutually exclusive and exhaustive, so the order of the rules would not matter. An example of
                                   the transformation of a decision tree into a set of decision rules is given in Figure 4.6, where the
                                   two given attributes, A and B, may have two possible values, 1 and 2, and the final classification
                                   is into one of two classes.
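The path-to-rule transformation can be sketched in Python as follows. The tree used here is a hypothetical one built for illustration, not the tree of Figure 4.6; it only borrows the setting of two attributes, A and B, with values 1 and 2, and two classes. The representation (nested tuples and dictionaries) and all function names are likewise assumptions of this sketch.

```python
from itertools import product

# Hypothetical tree: an internal node is (attribute, {value: subtree});
# a leaf is a class label string.
tree = ("A", {
    1: ("B", {1: "CLASS1", 2: "CLASS2"}),
    2: "CLASS2",
})


def extract_rules(node, conditions=()):
    """Return one (conditions, label) pair per leaf of the tree."""
    if isinstance(node, str):  # leaf: the path so far is the IF part
        return [(conditions, node)]
    attribute, branches = node
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules


def classify_with_tree(node, sample):
    """Follow the tests from the root down to a leaf."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[sample[attribute]]
    return node


def matching_rules(rules, sample):
    """Labels of all rules whose IF part is satisfied by the sample."""
    return [label for conds, label in rules
            if all(sample[attr] == val for attr, val in conds)]


def format_rule(conditions, label):
    tests = " AND ".join(f"{attr} = {val}" for attr, val in conditions)
    return f"IF {tests} THEN {label}"


rules = extract_rules(tree)
for conds, label in rules:
    print(format_rule(conds, label))

# Every sample fires exactly one rule, and that rule agrees with the tree.
for a, b in product((1, 2), repeat=2):
    sample = {"A": a, "B": b}
    fired = matching_rules(rules, sample)
    assert len(fired) == 1 and fired[0] == classify_with_tree(tree, sample)
```

The final loop checks, for this small example, the two properties stated in the text: the IF parts are mutually exclusive and exhaustive, and the rule set classifies each sample exactly as the tree does.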



