Page 139 - DECO504_STATISTICAL_METHODS_IN_ECONOMICS_ENGLISH
Unit 9: Correlation: Definition, Types and its Application for Economists
supported by reference to the following chart showing the yearly average price of consols and
Sauerbeck’s index numbers of prices.
The foregoing illustrations show the need of economists for a quantitative measure of correlation.
Such a measure has been widely used in biological statistics and, to a limited extent, in economic
statistics. G.U. Yule has used the measure in his study of “Pauperism”; R.H. Hooker has used it in
his “Correlation of the Weather and Crops”; J. P. Norton applied it in his study of the “New York
Money Market.” This measure, the coefficient of correlation, will be computed for the data upon
which the conclusions quoted above are based. The formula for the coefficient of correlation is
$$ r = \frac{\sum xy}{n \, \sigma_1 \sigma_2}, $$
where:
$x$ = deviation from arithmetic mean = $X - M_1$
$y$ = deviation from arithmetic mean = $Y - M_2$
$\sigma_1$ = standard deviation of X series
$\sigma_2$ = standard deviation of Y series
$n$ = number of items.
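As a minimal sketch of how the formula above might be evaluated, the following Python function follows the definitions term by term. The function name `correlation_coefficient` and the sample series `hours` and `wages` are hypothetical, introduced only for illustration. The standard deviations are assumed to be computed with divisor n, as the formula implies.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Return r = sum(xy) / (n * sigma_1 * sigma_2) for two equal-length series."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(xs)
    m1 = sum(xs) / n                       # M_1, arithmetic mean of the X series
    m2 = sum(ys) / n                       # M_2, arithmetic mean of the Y series
    dx = [X - m1 for X in xs]              # x = X - M_1
    dy = [Y - m2 for Y in ys]              # y = Y - M_2
    sigma1 = sqrt(sum(d * d for d in dx) / n)   # standard deviation, divisor n
    sigma2 = sqrt(sum(d * d for d in dy) / n)
    if sigma1 == 0 or sigma2 == 0:
        raise ValueError("r is undefined when a series has zero standard deviation")
    return sum(a * b for a, b in zip(dx, dy)) / (n * sigma1 * sigma2)


# Hypothetical data, chosen only to illustrate a perfect inverse relation.
hours = [1, 2, 3, 4]
wages = [8, 6, 4, 2]
print(correlation_coefficient(hours, wages))   # approximately -1
```

The guard against a zero standard deviation is a design choice of this sketch: the denominator of the formula vanishes when either series is constant, so no value of r can be reported in that case.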
The coefficient of correlation “serves as a measure of any statement involving two qualifying adjectives,
which can be measured numerically, such as ‘tall men have tall sons,’ ‘wet springs bring dry summers,’
‘short hours go with high wages.’” It is not the purpose in what follows to go through the mathematical
derivation of the coefficient of correlation, but to test the formula empirically in order to ascertain
how it actually varies for given series of statistics and to point out some of its features.
However, it should be noted at this point that the coefficient of correlation is not empirical but was
derived by a priori reasoning. It was found by assuming that a large number of independent causes
operate upon each of the two series X and Y, producing normal distributions in both cases. Upon the
assumption that the set of causes operating upon the series X is not independent of the set of causes
operating upon the series Y, the value $r = \frac{\sum xy}{n \sigma_1 \sigma_2}$ is obtained. This value becomes zero when the
operating causes are absolutely independent. Hence the value of r was taken as a measure of
correlation. In what follows no assumption concerning the type of distribution of the X and Y series will be
made.
Some appreciation of the meaning of the coefficient of correlation can be obtained by the consideration
of a few simple applications. Suppose that we consider the two series of measurements:
X = 1, 2, 3, 4, 5        $M_1 = 3$
Y = 6, 8, 10, 12, 14     $M_2 = 10$

  Deviations.      Square of Deviation.      Product of Deviations.
   x      y          x^2       y^2                  xy
  −2     −4           4        16                    8
  −1     −2           1         4                    2
   0      0           0         0                    0
  +1     +2           1         4                    2
  +2     +4           4        16                    8

$\sigma_1 = \sqrt{2}$
$\sigma_2 = 2\sqrt{2}$
$r = \frac{20}{5 \cdot \sqrt{2} \cdot 2\sqrt{2}} = 1$
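The arithmetic of this example can be retraced column by column. The short Python script below is a sketch for checking the table, not part of the original text. It uses the two series X and Y given above, and the variable names (`rows`, `sum_x2`, and so on) are illustrative.

```python
from math import sqrt

X = [1, 2, 3, 4, 5]
Y = [6, 8, 10, 12, 14]
n = len(X)

M1 = sum(X) / n                      # arithmetic mean of X
M2 = sum(Y) / n                      # arithmetic mean of Y

# One (x, y) pair of deviations per item, as in the first two table columns.
rows = [(a - M1, b - M2) for a, b in zip(X, Y)]

sum_x2 = sum(x * x for x, _ in rows)     # total of the x^2 column
sum_y2 = sum(y * y for _, y in rows)     # total of the y^2 column
sum_xy = sum(x * y for x, y in rows)     # total of the xy column

sigma1 = sqrt(sum_x2 / n)                # standard deviation of X, divisor n
sigma2 = sqrt(sum_y2 / n)                # standard deviation of Y, divisor n
r = sum_xy / (n * sigma1 * sigma2)

# Print the table rows: x, y, x^2, y^2, xy.
for x, y in rows:
    print(f"{x:+.0f} {y:+.0f} {x * x:.0f} {y * y:.0f} {x * y:.0f}")
print("sums:", sum_x2, sum_y2, sum_xy)
print("r =", r)
```

The column totals come to 10, 40 and 20, so the standard deviations are the square roots of 10/5 = 2 and 40/5 = 8, and r equals 1 up to floating-point rounding. This is what one would expect, since every Y value here is an exact linear function of the corresponding X value (Y = 2X + 4).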
LOVELY PROFESSIONAL UNIVERSITY 133