Page 299 - DMTH404_STATISTICS

Unit 22: Correlation



            22.3 Karl Pearson's Coefficient of Linear Correlation


            Let us assume, again, that we have data on two variables X and Y denoted by the pairs $(X_i, Y_i)$,
            $i = 1, 2, \ldots, n$. Further, let the scatter diagram of the data be as shown in Figure 22.3.
            Let $\bar{X}$ and $\bar{Y}$ be the arithmetic means of X and Y respectively. Draw two lines, $X = \bar{X}$ and
            $Y = \bar{Y}$, on the scatter diagram. These two lines, which intersect at the point $(\bar{X}, \bar{Y})$ and are
            mutually perpendicular, divide the whole diagram into four parts, termed quadrants I, II, III and IV,
            as shown.

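            [Figure 22.3: Scatter diagram divided into quadrants I, II, III and IV by the lines $X = \bar{X}$ and $Y = \bar{Y}$]
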
            As mentioned earlier, the correlation between X and Y will be positive if low (high) values of X
            are associated with low (high) values of Y. In terms of the above figure, we can say that when
            values of X that are greater (less) than $\bar{X}$ are generally associated with values of Y that are
            greater (less) than $\bar{Y}$, the correlation between X and Y will be positive. This implies that there
            will be a general tendency of points to concentrate in quadrants I and III. Similarly, when the
            correlation between X and Y is negative, the points of the scatter diagram will have a general
            tendency to concentrate in quadrants II and IV.
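
            The quadrant argument above can be checked numerically. The following is a minimal sketch, not
            part of the original text: the data are hypothetical, the function name `quadrant_counts` is
            illustrative, and points lying exactly on one of the dividing lines are counted separately (an
            assumption, since the text does not say how to treat them).

```python
from collections import Counter

def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the lines X = mean(X) and Y = mean(Y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    counts = Counter()
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        if dx == 0 or dy == 0:
            # Assumption: points on a dividing line belong to no quadrant.
            counts["on a dividing line"] += 1
        elif dx > 0 and dy > 0:
            counts["I"] += 1
        elif dx < 0 and dy > 0:
            counts["II"] += 1
        elif dx < 0 and dy < 0:
            counts["III"] += 1
        else:
            counts["IV"] += 1
    return counts

# Hypothetical data in which Y tends to rise with X.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]
counts = quadrant_counts(xs, ys)
print("Quadrants I and III:", counts["I"] + counts["III"])   # 4
print("Quadrants II and IV:", counts["II"] + counts["IV"])   # 2
```

            For these data, more points fall in quadrants I and III than in II and IV, which is the pattern
            described for positive correlation.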
            Further, if we consider deviations of values from their means, i.e., $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$, we
            note that:
            (i)   Both $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$ will be positive for all points in quadrant I.
            (ii)  $(X_i - \bar{X})$ will be negative and $(Y_i - \bar{Y})$ will be positive for all points in quadrant II.
            (iii) Both $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$ will be negative for all points in quadrant III.
            (iv)  $(X_i - \bar{X})$ will be positive and $(Y_i - \bar{Y})$ will be negative for all points in quadrant IV.
            It is obvious from the above that the product of deviations, i.e., $(X_i - \bar{X})(Y_i - \bar{Y})$, will be positive
            for points in quadrants I and III and negative for points in quadrants II and IV.
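
            As an illustration of this sign rule, the sketch below uses four hypothetical points, one in each
            quadrant, chosen so that both means equal 2. The helper name `deviation_products` is
            illustrative and not from the text.

```python
def deviation_products(xs, ys):
    """Return the product (x - mean(X)) * (y - mean(Y)) for every point."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]

# Hypothetical points, one per quadrant; both means are 2.
#        I   II  III  IV
xs = [4,  0,  0,   4]
ys = [4,  4,  0,   0]

products = deviation_products(xs, ys)
for label, p in zip(["I", "II", "III", "IV"], products):
    print(f"quadrant {label}: product of deviations = {p}")
# quadrant I:   4.0
# quadrant II: -4.0
# quadrant III: 4.0
# quadrant IV: -4.0
```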
            Since, for positive correlation, the points will tend to concentrate more in quadrants I and III
            than in II and IV, the sum of positive products of deviations will outweigh the sum of negative
            products of deviations. Thus, the sum $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$, taken over all the n
            observations, will be positive.
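
            A small worked example may help here. The five pairs (1, 2), (2, 5), (3, 3), (4, 4), (5, 6) are
            hypothetical data chosen for easy arithmetic and do not come from the text. One product of
            deviations is negative, but the positive products outweigh it:

```latex
\[
\begin{aligned}
\bar{X} &= \frac{1+2+3+4+5}{5} = 3, \qquad
\bar{Y} = \frac{2+5+3+4+6}{5} = 4,\\[4pt]
\sum_{i=1}^{5} (X_i-\bar{X})(Y_i-\bar{Y})
  &= (-2)(-2) + (-1)(1) + (0)(-1) + (1)(0) + (2)(2)\\
  &= 4 - 1 + 0 + 0 + 4 = 7 > 0.
\end{aligned}
\]
```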
            Similarly, when the correlation is negative, the points will tend to concentrate more in quadrants II
            and IV than in I and III. Thus, the sum of negative products of deviations will outweigh the
            sum of positive products, and hence the sum $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$, taken over all the n
            observations, will be negative.
            Further, if there is no correlation, the sum of positive products of deviations will be equal in
            magnitude to the sum of negative products of deviations, so that $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$ will
            be equal to zero.
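
            The three cases (positive, negative and no correlation) can be compared side by side. This is a
            minimal sketch under stated assumptions: the three Y series are hypothetical, constructed so that
            the sums come out as round numbers, and `sum_of_deviation_products` is an illustrative name.

```python
def sum_of_deviation_products(xs, ys):
    """Sum of (x - mean(X)) * (y - mean(Y)) over all observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Hypothetical data: one X series paired with three different Y series.
xs = [1, 2, 3, 4, 5]
rising = [2, 5, 3, 4, 6]     # Y tends to increase with X
falling = [6, 4, 3, 5, 2]    # Y tends to decrease as X increases
balanced = [3, 5, 4, 5, 3]   # positive and negative products cancel

print(sum_of_deviation_products(xs, rising))    # 7.0
print(sum_of_deviation_products(xs, falling))   # -7.0
print(sum_of_deviation_products(xs, balanced))  # 0.0
```

            The sign of the sum matches the direction of the association in each series. With real data the
            sum is rarely exactly zero, so a value close to zero relative to the spread of the data is what
            one would look for in practice.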
                                             LOVELY PROFESSIONAL UNIVERSITY                                  291