Page 151 - DECO504_STATISTICAL_METHODS_IN_ECONOMICS_ENGLISH

Unit 9: Correlation: Definition, Types and its Application for Economists


            •   According to the Report of the United States Bureau of Labor, “the percentage of successful strikes decreases during
                periods of business prosperity and increases during ‘hard times.’” In the accompanying charts
                the per cent. of establishments in which strikes were successful is plotted, first, with the per capita
                exports and imports and second, with index numbers of wholesale prices. The foreign trade and
                the price statistics are taken as indicative of the activity of business, as indices of prosperity.
            •   The coefficient of correlation “serves as a measure of any statement involving two qualifying
                adjectives, which can be measured numerically, such as ‘tall men have tall sons,’ ‘wet springs
                bring dry summers,’ ‘short hours go with high wages.’” It is not the purpose in what follows to
                go through the mathematical derivation of the coefficient of correlation, but to test the formula
                empirically in order to ascertain how it actually varies for given series of statistics and to point
                out some of its features.
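For reference, the product-moment form of the coefficient can be written as below. This is a sketch on the assumption that "the formula" tested here is the usual Pearson coefficient; the symbols x_i, y_i (paired observations), their means, and n (number of pairs) are notation introduced for illustration and do not appear in the text.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le +1
```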
            •   The correlation coefficients show that there is a very great difference in the degree of correlation
                of different pairs of series of statistics. The full significance of the “probable error,” which is
                used as a measure of unreliability of any determination, cannot be developed at this point. It is
                sufficient to note that, “When r is not greater than its probable error we have no evidence that
                there is any correlation, for the observed phenomena might easily arise from totally unconnected
                causes; but, when r is greater than, say, six times its probable error, we may be practically
                certain that the phenomena are not independent of each other, for the chance that the observed
                results would be obtained from unconnected causes is practically zero.”
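The rule quoted above can be sketched in a few lines of Python. The sketch assumes the classical probable-error formula 0.6745 (1 − r²)/√n, which the text does not state explicitly. The function names, the use of the absolute value of r (so that negative coefficients are handled), and the label "inconclusive" for the range between one and six probable errors are illustrative choices, not the author's.

```python
import math


def probable_error(r, n):
    """Classical probable error of r (assumed formula): 0.6745 * (1 - r^2) / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def judge(r, n):
    """Apply the quoted rule of thumb to a coefficient r computed from n pairs."""
    pe = probable_error(r, n)
    size = abs(r)  # assumption: the rule is applied to the magnitude of r
    if size <= pe:
        return "no evidence of correlation"
    if size > 6 * pe:
        return "practically certain"
    # The quoted rule is silent on this middle range; the label is ours.
    return "inconclusive"


print(judge(0.10, 25))
print(judge(0.98, 26))
```

With 25 pairs, a coefficient of 0.10 is smaller than its probable error (about 0.13), so the rule gives no evidence of correlation; a coefficient of 0.98 from 26 pairs exceeds six times its probable error many times over.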
            •   The amount of correlation indicated in each case is small; considering the number of years
                taken, it is so small that no conclusion as to the connection between the two series can be drawn.
                The correlation coefficient in the last instance, i.e., between per cent. of successful strikes and
                business distrust, suggests an opposite conclusion to that indicated by the other coefficients
                and that of Mr. Cross. The analysis shows that the conclusion that there is negative correlation
                between general prosperity and per cent. of successful strikes is not warranted.
            •   The coefficient for the two series, population and bank reserves, came out to be 0.98. This high
                coefficient comes from the fact that the long-time variation of both series is the same.
                Consequently, before it is legitimate to draw any conclusions as to the meaning of a lack of
                correlation, or amount of correlation between two series of measurements it is necessary to
                ascertain the periodic and the secular variations in the two series. This correlation coefficient
                may be large through the correspondence of either secular or periodic variation, or both. It may
                be null because one variation covers up the other.
            •   For a stationary price the production must increase 46 million bushels per year.
                It seemed to me that if percentage changes in price and production were used instead of absolute
                changes a still closer correlation might result. The computation of ρ from such percentages,
                however, gave −0.794.
            •   In the preceding illustrations the amount of correlation between the differences was greater
                than that between the original series. The method of differences has also been used by the
                writer for Kemmerer’s statistics (considered on page 15 of this article) of (1) money in circulation,
                and (2) bank reserves for the period 1879–1904, with the result ρ = +0.392, whereas the value
                of r is 0.98. This shows that there is a lack of correspondence of the short-time variations in
                these two series.
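The contrast between r (original series) and ρ (first differences) can be illustrated with a small Python sketch. The data are synthetic and chosen only to make the effect visible: two series share the same upward trend but have opposite year-to-year movements. They are not Kemmerer's figures, and the helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_differences(series):
    """Year-to-year changes: each value minus the one before it."""
    return [b - a for a, b in zip(series, series[1:])]


# Synthetic data (an assumption for illustration): a common trend of 10 per
# year plus a small alternating fluctuation that moves the series oppositely.
years = range(10)
wiggle = [1 if t % 2 == 0 else -1 for t in years]
series_a = [10 * t + w for t, w in zip(years, wiggle)]
series_b = [10 * t - w for t, w in zip(years, wiggle)]

r_levels = pearson_r(series_a, series_b)
rho_differences = pearson_r(first_differences(series_a),
                            first_differences(series_b))

print("r (original series):   ", round(r_levels, 3))
print("rho (first differences):", round(rho_differences, 3))
```

Here the original series correlate at nearly +1 because of the shared trend, while the differences correlate at −1 (up to rounding) because the short-time movements are opposed. This is the same kind of gap, in exaggerated form, as the one between r and ρ reported above.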
            •   Mr. G. U. Yule, in the paper already referred to,* has worked out the general solution of the
                problem of the correlation between three variables. In the course of the solution the problem
                just considered is solved incidentally. The argument is similar to that used in the case of two
                variables and so it will not be repeated here. A concrete notion of the results secured by Mr.
                Yule can be obtained from the following explanation taken from Mr. Hooker’s article on the
                “Correlation of the Weather and the Crops.”
            •   The ordinary graphic method of measuring correlation is inadequate. The coefficient of
                correlation is simple and yet is sensitive to small changes. It has been used in many fields of
                statistics by Galton, Pearson, Yule, Hooker, Elderton and others. The experience of these writers




                                             LOVELY PROFESSIONAL UNIVERSITY                                      145