Page 92 - DECO504_STATISTICAL_METHODS_IN_ECONOMICS_ENGLISH

Statistical Methods in Economics


The value v of the P-th percentile may now be calculated as follows:
If $P < p_1$ or $P > p_N$, then we take $v = v_1$ or $v = v_N$, respectively.
If there is some integer k for which $P = p_k$, then we take $v = v_k$.
Otherwise, we find the integer k for which $p_k < P < p_{k+1}$, and take
$$v = v_k + \frac{P - p_k}{p_{k+1} - p_k}\,(v_{k+1} - v_k) = v_k + N \times \frac{P - p_k}{100} \times (v_{k+1} - v_k).$$
Using the list of numbers above, the 40th percentile would be found by linearly interpolating between the 30th percentile, 20, and the 50th, 35. Specifically:
$$v = 20 + 5 \times \frac{40 - 30}{100} \times (35 - 20) = 27.5$$
                                  This is halfway between 20 and 35, which one would expect since the rank was calculated above as
                                  2.5.
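The interpolation rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the unweighted percent rank $p_n = \frac{100}{N}(n - \frac{1}{2})$, which agrees with the worked example (30 and 50 for the second and third of five values). The function name `percentile_interp` and the sample list are illustrative assumptions: only N = 5 and the values 20 and 35 are taken from the example, and the other entries are placeholders.

```python
def percentile_interp(sorted_values, P):
    """P-th percentile by linear interpolation between closest ranks.

    Assumes `sorted_values` is in ascending order and that the n-th value
    (1-based) has percent rank p_n = 100/N * (n - 1/2).
    """
    n_values = len(sorted_values)
    if n_values == 0:
        raise ValueError("need at least one value")

    # Percent ranks p_1, ..., p_N (assumed definition, see docstring).
    ranks = [100.0 / n_values * (n - 0.5) for n in range(1, n_values + 1)]

    # Below p_1 or above p_N the percentile is clamped to v_1 or v_N.
    if P <= ranks[0]:
        return sorted_values[0]
    if P >= ranks[-1]:
        return sorted_values[-1]

    # Otherwise find k with p_k <= P <= p_{k+1} and interpolate.
    for k in range(n_values - 1):
        if ranks[k] <= P <= ranks[k + 1]:
            frac = (P - ranks[k]) / (ranks[k + 1] - ranks[k])
            return sorted_values[k] + frac * (sorted_values[k + 1] - sorted_values[k])

    raise ValueError("P must be a real number")


# Illustrative data: only N = 5 and the values 20 and 35 come from the example.
data = [15, 20, 35, 40, 50]
print(percentile_interp(data, 40))  # 27.5, as in the worked example
```

The case $P = p_k$ needs no separate branch here, because the interpolation fraction is then 0 or 1 and the result is exactly $v_k$.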
It is readily confirmed that the 50th percentile of any list of values according to this definition of the P-th percentile is just the sample median.
Moreover, when N is even, the 25th percentile according to this definition of the P-th percentile is the median of the first $N/2$ values (i.e., the median of the lower half of the data).
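Both claims can be checked by solving for the rank n at which the percent rank equals P. The short derivation below is a sketch that assumes the unweighted percent rank $p_n = \frac{100}{N}(n - \frac{1}{2})$. That definition is not restated in this passage, but it agrees with the worked example, where N = 5 gives 30 and 50 for the second and third values.

```latex
\begin{align*}
p_n = 50 &\iff \frac{100}{N}\left(n - \tfrac{1}{2}\right) = 50
          \iff n = \frac{N + 1}{2}, \\
p_n = 25,\ N = 2m &\iff \frac{100}{2m}\left(n - \tfrac{1}{2}\right) = 25
          \iff n = \frac{m + 1}{2}.
\end{align*}
```

The rank $(N+1)/2$ is the middle position of all N values, and $(m+1)/2$ is the middle position of the first $m = N/2$ values. When such a rank is not an integer it lies exactly halfway between two neighbouring positions, so the interpolation returns the average of the two neighbouring values, just as the usual median does.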
                                  Weighted percentile
In addition to the percentile function, there is also a weighted percentile, in which the percentage of the total weight is counted instead of the percentage of the total number of values. There is no standard function for a weighted percentile. One method extends the above approach in a natural way.
Suppose we have positive weights $w_1, w_2, w_3, \ldots, w_N$ associated, respectively, with our N sorted sample values. Let
$$S_n = \sum_{k=1}^{n} w_k,$$
                                  the n-th partial sum of the weights. Then the formulas above are generalized by taking
$$p_n = \frac{100}{S_N}\left(S_n - \frac{w_n}{2}\right)$$
                                  and
$$v = v_k + \frac{P - p_k}{p_{k+1} - p_k}\,(v_{k+1} - v_k).$$
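The weighted version can be sketched in Python along the same lines. This is an illustration under stated assumptions rather than a standard routine: the function name `weighted_percentile` is hypothetical, the values are assumed to be sorted in ascending order, and values of P outside $[p_1, p_N]$ are assumed to be clamped to $v_1$ or $v_N$ as in the unweighted rule.

```python
from itertools import accumulate


def weighted_percentile(sorted_values, weights, P):
    """Weighted P-th percentile by linear interpolation.

    Uses p_n = 100/S_N * (S_n - w_n/2), where S_n is the n-th partial sum
    of the weights. `sorted_values` must be ascending; weights positive.
    """
    if not sorted_values or len(sorted_values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")

    partial_sums = list(accumulate(weights))  # S_1, ..., S_N
    total = partial_sums[-1]                  # S_N
    ranks = [100.0 / total * (s - w / 2.0) for s, w in zip(partial_sums, weights)]

    # Assumed clamping outside [p_1, p_N], mirroring the unweighted rule.
    if P <= ranks[0]:
        return sorted_values[0]
    if P >= ranks[-1]:
        return sorted_values[-1]

    # Positive weights make the ranks strictly increasing, so the
    # denominator below is never zero.
    for k in range(len(ranks) - 1):
        if ranks[k] <= P <= ranks[k + 1]:
            frac = (P - ranks[k]) / (ranks[k + 1] - ranks[k])
            return sorted_values[k] + frac * (sorted_values[k + 1] - sorted_values[k])

    raise ValueError("P must be a real number")


# With equal weights the result matches the unweighted interpolation.
print(weighted_percentile([15, 20, 35, 40, 50], [1, 1, 1, 1, 1], 40))  # 27.5
```

With equal weights, $S_n = n w$ and $S_N = N w$, so $p_n$ reduces to $\frac{100}{N}(n - \frac{1}{2})$ and the two methods coincide.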
                                  Alternative methods
Some software packages, including Microsoft Excel (up to version 2007), use the following method, noted as an alternative by NIST, to estimate the value, $v_P$, of the P-th percentile of an ascending ordered dataset containing N elements with values $v_1 \le v_2 \le \cdots \le v_N$.
                                  The rank is calculated:
$$n = \frac{P}{100}(N - 1) + 1$$
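As a small illustration, the rank formula can be evaluated in Python. The function name `alternative_rank` is a hypothetical label, and the sketch stops at the rank itself because the passage above does not state how a fractional rank is converted into a value.

```python
def alternative_rank(P, n_values):
    """Rank n = P/100 * (N - 1) + 1 for the P-th percentile of N values."""
    if not 0 <= P <= 100:
        raise ValueError("P must lie between 0 and 100")
    if n_values < 1:
        raise ValueError("need at least one value")
    return P / 100.0 * (n_values - 1) + 1


# For N = 5 the 40th percentile sits at a rank of about 2.6, whereas the
# interpolation method described earlier placed it at rank 2.5.
print(alternative_rank(40, 5))
```

By construction P = 0 maps to rank 1 and P = 100 maps to rank N, so the smallest and largest values are the 0th and 100th percentiles under this method.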




LOVELY PROFESSIONAL UNIVERSITY