Page 175 - DMGT404 RESEARCH_METHODOLOGY
Unit 9: Correlation and Regression
Similarly, when correlation is negative, the points will tend to concentrate more in quadrants II and IV than in I and III. Thus, the sum of negative products of deviations will outweigh the sum of positive products and hence $\sum (X_i - \bar{X})(Y_i - \bar{Y})$, summed over all the n observations, will be negative.
Further, if there is no correlation, the sum of positive products of deviations will be equal to the sum of negative products of deviations such that $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ will be equal to zero.
On the basis of the above, we can consider $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ as an absolute measure of correlation.
This measure, like other absolute measures of dispersion, skewness, etc., will depend upon (i)
the number of observations and (ii) the units of measurement of the variables.
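The two dependences named above can be checked numerically. The following is a minimal sketch, assuming made-up height and weight data; the function name `sum_of_products` and all values are illustrative and not taken from the text.

```python
def sum_of_products(xs, ys):
    """Return sum of (X_i - X_bar) * (Y_i - Y_bar) over all observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Hypothetical data: heights in metres and weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [50.0, 58.0, 63.0, 72.0]

base = sum_of_products(heights_m, weights_kg)

# (ii) Units: expressing height in centimetres multiplies the measure by 100.
heights_cm = [100 * h for h in heights_m]
in_cm = sum_of_products(heights_cm, weights_kg)

# (i) Number of observations: repeating every pair doubles n and doubles
# the measure, although the pattern of association is unchanged.
doubled = sum_of_products(heights_m * 2, weights_kg * 2)

# Sign: a decreasing pattern gives a negative sum of products.
negative_case = sum_of_products([1, 2, 3], [3, 2, 1])

print(base, in_cm, doubled, negative_case)
```

Neither change alters how closely the two variables move together, which is why the measure is then averaged and standardised.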
In order to avoid its dependence on the number of observations, we take its average, i.e., $\frac{1}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})$. This term is called covariance in statistics and is denoted as Cov(X, Y).
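As an illustration, the covariance with divisor n, as defined above, might be computed as follows. This is a sketch on hypothetical data; the helper name `covariance` is an assumption made for the example.

```python
def covariance(xs, ys):
    """Cov(X, Y) = (1/n) * sum of (X_i - X_bar) * (Y_i - Y_bar), divisor n."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

# Hypothetical data: heights in metres and weights in kilograms.
xs = [1.50, 1.60, 1.70, 1.80]
ys = [50.0, 58.0, 63.0, 72.0]

cov = covariance(xs, ys)

# Averaging removes the dependence on n: repeating every pair leaves it unchanged.
cov_repeated = covariance(xs * 2, ys * 2)

# The dependence on units remains: heights in centimetres scale it by 100.
cov_cm = covariance([100 * x for x in xs], ys)

print(cov, cov_repeated, cov_cm)
```

The second comparison shows that covariance still carries the units of both variables.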
To eliminate the effect of units of measurement of the variables, the covariance term is divided
by the product of the standard deviation of X and the standard deviation of Y. The resulting
expression is known as the Karl Pearson’s coefficient of linear correlation or the product moment
correlation coefficient or simply the coefficient of correlation, between X and Y.
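$$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \qquad \ldots(1)$$

or

$$r_{XY} = \frac{\frac{1}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\frac{1}{n}\sum (X_i - \bar{X})^2}\,\sqrt{\frac{1}{n}\sum (Y_i - \bar{Y})^2}} \qquad \ldots(2)$$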
Cancelling $\frac{1}{n}$ from the numerator and the denominator, we get

$$r_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}} \qquad \ldots(3)$$
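The deviation form of the coefficient can be sketched in a few lines of Python. The function name `pearson_r` and the data are hypothetical, and the sketch assumes neither variable is constant, since the denominator would otherwise be zero.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: heights in metres and weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [50.0, 58.0, 63.0, 72.0]

r_metres = pearson_r(heights_m, weights_kg)

# Changing the unit of X leaves r unchanged, because the factor 100
# appears in both the numerator and the denominator.
r_centimetres = pearson_r([100 * h for h in heights_m], weights_kg)

# Exact linear relationships give the extreme values +1 and -1.
r_perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])
r_perfect_negative = pearson_r([1, 2, 3], [3, 2, 1])

print(r_metres, r_centimetres, r_perfect_positive, r_perfect_negative)
```

Unlike the covariance, the result is a pure number, so it can be compared across data sets measured in different units.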
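Consider

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum (X_i - \bar{X})Y_i - \bar{Y}\sum (X_i - \bar{X})$$

$$= \sum X_i Y_i - \bar{X}\sum Y_i \qquad \text{(second term is zero)}$$

$$= \sum X_i Y_i - n\bar{X}\bar{Y} \qquad \left(\text{since } \sum Y_i = n\bar{Y}\right)$$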
Similarly we can write

$$\sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2$$

and

$$\sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - n\bar{Y}^2$$
Substituting these values in equation (3), we have
$$r_{XY} = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\left[\sum X_i^2 - n\bar{X}^2\right]\left[\sum Y_i^2 - n\bar{Y}^2\right]}} \qquad \ldots(4)$$
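A quick numerical check can confirm that the raw-sums form agrees with the deviation form of equation (3). The sketch below uses hypothetical data, and both function names are illustrative.

```python
import math

def pearson_r_deviations(xs, ys):
    """Equation (3): built from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def pearson_r_raw_sums(xs, ys):
    """Equation (4): built from raw sums of X*Y, X^2 and Y^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - n * x_bar * y_bar
    denominator = math.sqrt((sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2))
    return numerator / denominator

# Hypothetical data with X_bar = 3 and Y_bar = 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviations(xs, ys)
r_raw = pearson_r_raw_sums(xs, ys)

# Here sum XY = 66, sum X^2 = 55 and sum Y^2 = 86,
# so r = (66 - 60) / sqrt(10 * 6) = 6 / sqrt(60).
print(r_dev, r_raw)
```

The raw-sums form avoids computing each deviation, which is convenient for hand calculation. In floating-point arithmetic it can lose precision when the means are large relative to the spread of the data, so the deviation form is often the safer choice in code.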
LOVELY PROFESSIONAL UNIVERSITY 169