Page 141 - DECO504_STATISTICAL_METHODS_IN_ECONOMICS_ENGLISH

Unit 9: Correlation: Definition, Types and its Application for Economists


                     x          y        x²        y²        xy

                    -2        -10         4       100        20
                    -1         -7         1        49         7
                     0         -2         0         4         0
                    +1         +5         1        25         5
                    +2        +14         4       196        28

                 Total                   10       374        60

            $\sigma_1 = 1.41$,   $\sigma_2 = 8.65$,   $r = 0.981$

            Although the two series increase regularly, so that deviations of like sign always correspond,
            the correlation is not perfect, because the relation between X and Y is not linear.
            If the number of items in each series be increased to 11 and the Y items remain squares of the X’s the
            value of r will be 0.974.
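
            The two values of r quoted above can be checked numerically. The sketch below assumes that the X
            series is 1, 2, ..., n and that each Y is the square of its X; the text does not state the X values,
            but this choice reproduces the deviations in the table. The function names pearson_r and
            r_for_squares are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r = sum(xy) / (n * sigma_1 * sigma_2),
    where x and y are deviations from the arithmetic means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sigma_1 = math.sqrt(sum(a * a for a in dx) / n)
    sigma_2 = math.sqrt(sum(b * b for b in dy) / n)
    return sum_xy / (n * sigma_1 * sigma_2)

def r_for_squares(n):
    # Assumption: X = 1..n and Y = X squared.
    xs = list(range(1, n + 1))
    ys = [v * v for v in xs]
    return pearson_r(xs, ys)

r5 = r_for_squares(5)
r11 = r_for_squares(11)
print(round(r5, 3), round(r11, 3))  # 0.981 0.974
```

            Under that assumption the 5-item series gives the tabulated 0.981 and the 11-item series gives 0.974.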
            If there be no law connecting the X and Y series the products of the deviations (xy) are as apt to be
            negative as positive. The expression ∑xy  will therefore tend to approach zero. With a very large
            number of measurements the correlation coefficient will approximate zero.
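
            A small simulation can illustrate this tendency. The sketch below is not part of the original text:
            it assumes two independent series drawn from a standard normal distribution, uses an arbitrary fixed
            seed for repeatability, and computes r for increasing numbers of measurements.

```python
import math
import random

def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the arithmetic means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

rng = random.Random(0)  # fixed seed: an arbitrary choice, for repeatability only
results = {}
for n in (10, 1000, 100000):
    # Two series with no law connecting them (independent draws).
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    results[n] = pearson_r(xs, ys)
    print(n, round(results[n], 4))
```

            With few measurements r may stray noticeably from zero by chance; with many it should lie close
            to zero.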
            As the relationship between the pair of series of measurements passes from none at all to an exact
            linear one, the correlation coefficient varies from 0 to ±1.
            Suppose that we are investigating the relation existing between two series of measurements X and Y.
            Let points be plotted on cross-section paper whose coordinates are corresponding measurements $X_1$
            and $Y_1$. If there be a relationship existing between the two series, the points thus located will not lie
            chaotically all over the plane, but they will range themselves about some curve or locus. This curve,
            which has been called the curve of regression, is illustrated in the accompanying diagram. The straight
            line best fitting the points is called the line of regression.
            For example suppose we consider the two series of index numbers for the period 1879-1904 inclusive,
            representing (1) money in circulation in the United States inclusive of bank reserves, and (2) bank
            reserves. Let points be located with abscissas proportionate to the money in circulation and with
            ordinates proportionate to the bank reserves of the same year. The chart on the next page shows that
            these points lie near a straight line, the line of regression.
            The coefficient of correlation (r) is a measure of the closeness of the grouping of the points about this line of
            regression. If the points should all range themselves on a line then r would equal +1 or -1 depending
            upon whether, looking left to right, the line sloped upward or downward.
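
            The limiting case can be checked with made-up points lying exactly on a line. In the sketch below the
            slopes and intercepts (3 and 2, -0.5 and 10) are arbitrary illustrative values, not data from the text.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the arithmetic means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

xs = [1, 2, 3, 4, 5, 6]
rising = [3 * v + 2 for v in xs]        # line sloping upward, left to right
falling = [-0.5 * v + 10 for v in xs]   # line sloping downward, left to right

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)
print(round(r_up, 6), round(r_down, 6))  # 1.0 -1.0
```

            Only the direction of the slope fixes the sign of r; its steepness does not affect the magnitude.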
            We will now derive the equation of the line of regression. Let X and Y be associated measurements
            and x and y be associated deviations from the respective arithmetic means. A linear relation between
            the measurements is of the form
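                               $Y = a_1 X + b_1$
            Since the arithmetic means satisfy the same relation, subtracting it removes the constant $b_1$, and
            the relation between the deviations will be of the form
                               $y = a_1 x$   or   $y - a_1 x = 0$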
            Since all of the points are not located exactly upon a straight line, the substitution of the values
            $x_1$, $y_1$, $x_2$, $y_2$, etc. in the equations will give residues $v_1$, $v_2$, etc. as follows:
                         $y_1 - a_1 x_1 = v_1$
                         $y_2 - a_1 x_2 = v_2$
                         ...
                         $y_n - a_1 x_n = v_n$
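
            To make the residues concrete, the sketch below evaluates v = y - a_1 x for the five pairs of
            deviations tabulated earlier in this unit. The trial slopes 5 and 6 are arbitrary illustrative values
            chosen here, not the slope derived in the text, and the function name residues is hypothetical.

```python
def residues(dev_x, dev_y, a1):
    """Return v_i = y_i - a1 * x_i for paired deviations x_i, y_i."""
    return [y - a1 * x for x, y in zip(dev_x, dev_y)]

# Deviations from the means, taken from the table earlier in this unit.
dev_x = [-2, -1, 0, 1, 2]
dev_y = [-10, -7, -2, 5, 14]

for a1 in (5, 6):  # arbitrary trial slopes, for illustration only
    v = residues(dev_x, dev_y, a1)
    print(a1, v, sum(r * r for r in v))
# 5 [0, -2, -2, 0, 4] 24
# 6 [2, -1, -2, -1, 2] 14
```

            Different trial slopes leave different sets of residues, so the choice of $a_1$ determines how
            closely the line follows the points.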
