Page 181 - DMGT404 RESEARCH_METHODOLOGY
Unit 9: Correlation and Regression
or $\frac{1}{n}\sum x_i'^2 + \frac{1}{n}\sum y_i'^2 - \frac{2}{n}\sum x_i' y_i' \ge 0$
or $1 + 1 - 2r \ge 0$ or $2 - 2r \ge 0$ or $r \le 1$ .... (12)
Combining the inequalities (11) and (12), we get $-1 \le r \le 1$. Hence r lies between – 1
and + 1.
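The bound can be checked numerically. The following is a minimal sketch in Python, assuming that x' and y' denote deviations from the mean divided by the standard deviation (with divisor n), and that inequality (11), which is not reproduced here, comes from the corresponding sum of squares of (x' + y'). The data values and the function names are illustrative only.

```python
import math

def standardize(values):
    """Return (v - mean) / sd for each value, using the divisor-n standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def bounds_check(xs, ys):
    """Return r together with the two mean squares used to bound it."""
    n = len(xs)
    xp = standardize(xs)
    yp = standardize(ys)
    r = sum(a * b for a, b in zip(xp, yp)) / n
    # Mean of (x' - y')^2 expands to 1 + 1 - 2r, which cannot be negative.
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(xp, yp)) / n
    # Mean of (x' + y')^2 expands to 1 + 1 + 2r (assumed form behind the lower bound).
    mean_sq_sum = sum((a + b) ** 2 for a, b in zip(xp, yp)) / n
    return r, mean_sq_diff, mean_sq_sum

# Hypothetical data, chosen only for illustration.
xs = [2, 4, 5, 7, 9]
ys = [1, 3, 2, 6, 8]

r, mean_sq_diff, mean_sq_sum = bounds_check(xs, ys)
print("r =", round(r, 4))
print("2 - 2r =", round(2 - 2 * r, 4), "mean of (x' - y')^2 =", round(mean_sq_diff, 4))
print("2 + 2r =", round(2 + 2 * r, 4), "mean of (x' + y')^2 =", round(mean_sq_sum, 4))
```

Since both mean squares are averages of squared quantities, neither can be negative, which is what forces r to stay within the two limits.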
3. If X and Y are independent they are uncorrelated, but the converse is not true.
If X and Y are independent, it implies that they do not reveal any tendency of simultaneous
movement either in the same or in opposite directions. The dots of the scatter diagram will be
uniformly spread in all the four quadrants. Therefore, $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ or Cov(X, Y)
will be equal to zero and hence, $r_{XY} = 0$. Thus, if X and Y are independent, they are
uncorrelated.
The converse of this property does not hold: if $r_{XY} = 0$, X and Y may not necessarily be
independent. To show this, we consider the following data:
X 1 2 3 4 5 6 7
Y 9 4 1 0 1 4 9
Here $\sum X_i = 28$, $\sum Y_i = 28$ and $\sum X_i Y_i = 112$.
$\mathrm{Cov}(X, Y) = \frac{1}{n}\left[\sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{n}\right] = \frac{1}{7}\left[112 - \frac{28 \times 28}{7}\right] = 0$. Thus, $r_{XY} = 0$.
A close examination of the given data would reveal that although $r_{XY} = 0$, X and Y are
not independent. In fact, they are related by the mathematical relation $Y = (X - 4)^2$.
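The arithmetic of this example can be reproduced with a short Python sketch. It uses the same seven pairs of values and the same covariance formula (divisor n); the function names are illustrative and not taken from any particular library.

```python
import math

# Data from the example in the text.
X = [1, 2, 3, 4, 5, 6, 7]
Y = [9, 4, 1, 0, 1, 4, 9]

def covariance(xs, ys):
    """Cov = (1/n) * [sum(x*y) - sum(x) * sum(y) / n]."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / n

def pearson_r(xs, ys):
    """Karl Pearson's coefficient: Cov(X, Y) / (sd of X * sd of Y)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

cov_xy = covariance(X, Y)
r_xy = pearson_r(X, Y)

# Check whether every pair satisfies the exact relation Y = (X - 4)^2.
exact_relation = all(y == (x - 4) ** 2 for x, y in zip(X, Y))

print("Sum X =", sum(X), " Sum Y =", sum(Y),
      " Sum XY =", sum(x * y for x, y in zip(X, Y)))
print("Cov(X, Y) =", cov_xy)
print("r_XY =", r_xy)
print("Y = (X - 4)^2 holds for every pair:", exact_relation)
```

The covariance vanishes because the positive and negative products of deviations cancel around X = 4, even though Y is completely determined by X.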
Caution: $r_{XY}$ is only a measure of the degree of linear association between X and Y. If the
association is non-linear, the computed value of $r_{XY}$ is no longer a measure of the degree
of association between the two variables.
9.1.4 Merits and Limitations of Coefficient of Correlation
The only merit of Karl Pearson’s coefficient of correlation is that it is the most popular method
for expressing the degree and direction of linear association between the two variables in terms
of a pure number, independent of units of the variables. This measure, however, suffers from
certain limitations, given below:
1. Coefficient of correlation r does not give any idea about the existence of a cause and effect
relationship between the variables. It is possible that a high value of r is obtained although
neither variable seems to be directly affecting the other. Hence, any interpretation of r should
be done very carefully.
2. It is only a measure of the degree of linear relationship between two variables. If the
relationship is not linear, the calculation of r does not have any meaning.
3. Its value is unduly affected by extreme items.
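The third limitation can be illustrated with a small Python sketch. The data below are hypothetical and chosen only to show the effect: six loosely related pairs, followed by the same pairs with one extreme observation added.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: six pairs with only a weak linear association.
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 1, 4, 2, 5, 3]
r_before = pearson_r(xs, ys)

# The same data with a single extreme item (30, 30) added.
xs_extreme = xs + [30]
ys_extreme = ys + [30]
r_after = pearson_r(xs_extreme, ys_extreme)

print("r without the extreme item:", round(r_before, 3))
print("r with the extreme item:   ", round(r_after, 3))
```

In this sketch a single extreme pair moves r from a weak value to one close to +1, although the relationship among the remaining six pairs is unchanged.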
LOVELY PROFESSIONAL UNIVERSITY 175