Page 202 - DMGT404 RESEARCH_METHODOLOGY

Research Methodology




                                   A Linear Model

                                   A linear model for the above data is
                                          ŷ = –37 + 5.1x

                                   The hat on ŷ indicates that ŷ is estimated from the data. The figure on the right shows a plot of
                                   this function: a line giving the predicted ŷ versus x, with the original values of y shown as red
                                   dots.

                                   The data at the extremes of x indicate that the relationship between y and x may be non-linear
                                   (look at the red dots relative to the regression line at low and high values of x). We thus turn to
                                   MARS to automatically build a model that takes non-linearities into account. MARS software
                                   constructs a model from the given x and y as follows:

                                          ŷ = 25
                                          + 6.1 max(0, x – 13)
                                          – 3.1 max(0, 13 – x)
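
To make the two fitted equations concrete, the following minimal Python sketch evaluates both of them at a few points. The function names (hinge, linear_model, mars_model) and the sample x values are assumptions chosen for illustration; they are not part of the original data set, which is not reproduced here.

```python
def hinge(a: float) -> float:
    """Hinge function max(0, a): zero for negative a, a itself otherwise."""
    return max(0.0, a)


def linear_model(x: float) -> float:
    """Linear model quoted in the text: y_hat = -37 + 5.1 x."""
    return -37 + 5.1 * x


def mars_model(x: float) -> float:
    """MARS model quoted in the text, with a knot at x = 13."""
    return 25 + 6.1 * hinge(x - 13) - 3.1 * hinge(13 - x)


if __name__ == "__main__":
    # Illustrative x values only (hypothetical, not the original observations).
    for x in (10, 13, 16):
        print(x, round(linear_model(x), 2), round(mars_model(x), 2))
```

Only one of the two hinge terms is non-zero at any x other than 13, so the MARS prediction follows one straight line to the left of x = 13 and a steeper one to the right.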

                                                                     Figure 9.9

                                   A Simple MARS Model of the Same Data

                                   Figure 9.10 shows a plot of this function: the predicted ŷ versus x, with the original values of y
                                   once again shown as red dots. The predicted response is now a better fit to the original y values.
                                   MARS has automatically produced a kink in the predicted y to take into account non-linearity.
                                   The kink is produced by hinge functions. The hinge functions are the expressions starting with
                                   max (where max(a, b) is a if a > b, else b). Hinge functions are described in more detail below.
                                   In this simple example, we can easily see from the plot that y has a non-linear relationship
                                   with x (and might perhaps guess that y varies with the square of x). However, in general there
                                   will be multiple independent variables, and the relationship between y and these variables will
                                   be unclear and not easily visible by plotting. We can use MARS to discover that non-linear
                                   relationship.
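
As a worked illustration of how the hinge functions produce the kink, the single-variable MARS model given earlier can be expanded into an ordinary piecewise-linear function. The expansion below uses only the coefficients quoted in the text; the two-case form is simply a rewriting of that model.

```latex
\[
\hat{y} = 25 + 6.1\,\max(0,\, x - 13) - 3.1\,\max(0,\, 13 - x)
        = \begin{cases}
            -15.3 + 3.1x, & x \le 13, \\
            -54.3 + 6.1x, & x > 13.
          \end{cases}
\]
```

Both pieces give ŷ = 25 at x = 13, so the prediction is continuous at the knot, while the slope changes there from 3.1 to 6.1.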

                                   An example MARS expression with multiple variables is

                                          ozone = 5.2
                                          + 0.93 max(0, temp – 58)
                                          – 0.64 max(0, temp – 68)





          196                               LOVELY PROFESSIONAL UNIVERSITY