Page 119 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                                 R-Precision

                                 Precision at the R-th position in the ranking of results for a query that has R relevant documents.
                                 This measure is highly correlated with Average Precision.
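
                                 The definition can be illustrated with a minimal Python sketch. The function name
                                 r_precision, the document identifiers and the sample ranking are hypothetical, chosen
                                 only for illustration; returning 0.0 for a query with no relevant documents is an
                                 assumption, since the text does not cover that case.

```python
def r_precision(ranked_ids, relevant_ids):
    """Precision over the top R results, where R = number of relevant documents."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        # Assumption: the measure is undefined without relevant documents; report 0.0.
        return 0.0
    top_r = ranked_ids[:r]
    hits = sum(1 for doc_id in top_r if doc_id in relevant)
    return hits / r


# Hypothetical query with R = 3 relevant documents.
ranking = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2", "d9"}

# Only d1 appears among the top 3 results, so R-precision is 1/3.
print(r_precision(ranking, relevant))
```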

                                 Mean average precision
                                 Mean average precision for a set of queries is the mean of the average precision scores for each query.
                                 \[ \mathrm{MAP} = \frac{\sum_{q=1}^{Q} \mathrm{AveP}(q)}{Q} \]
                                 where Q is the number of queries.
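
                                 As a rough sketch, MAP can be computed as below. The helper average_precision assumes
                                 the usual definition of AveP (the sum of the precision values at each rank where a
                                 relevant document occurs, divided by the number of relevant documents), which is not
                                 restated in this section. All names and sample data are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """AveP for one query (assumed standard definition, see lead-in)."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # Assumption: a query without relevant documents scores 0.0.
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of AveP(q) over Q queries; `runs` is a list of (ranking, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AveP = (1/1 + 2/3) / 2 = 5/6
    (["d4", "d5", "d6"], {"d5"}),        # AveP = (1/2) / 1 = 1/2
]

map_score = mean_average_precision(runs)  # (5/6 + 1/2) / 2 = 2/3
print(map_score)
```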

                                 Discounted cumulative gain
                                 DCG uses a graded relevance scale of documents from the result set to evaluate the usefulness, or
                                 gain, of a document based on its position in the result list. The premise of DCG is that highly relevant
                                 documents appearing lower in a search result list should be penalized: the graded relevance value
                                 of each result is reduced in logarithmic proportion to its position.
                                 The DCG accumulated at a particular rank position p is defined as:
                                 \[ \mathrm{DCG}_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i} \]
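
                                 A minimal sketch of this definition follows: the first result contributes its full
                                 relevance grade, and each later result at position i contributes its grade divided by
                                 the base-2 logarithm of i. The function name dcg_at_p and the sample grades are
                                 hypothetical.

```python
import math


def dcg_at_p(relevances, p):
    """DCG at rank p: rel_1 plus rel_i / log2(i) for i = 2..p."""
    rels = relevances[:p]
    if not rels:
        return 0.0
    total = float(rels[0])  # position 1 is not discounted
    for i, rel in enumerate(rels[1:], start=2):
        total += rel / math.log2(i)
    return total


# Hypothetical graded relevance judgments (0 = not relevant, 3 = highly relevant),
# listed in the order the system returned the documents.
grades = [3, 2, 3, 0, 1, 2]

dcg6 = dcg_at_p(grades, 6)
print(round(dcg6, 3))  # about 8.097
```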

                                 Since result sets may vary in size among different queries or systems, the normalised version of
                                 DCG uses an ideal DCG to compare performance. To this end, it sorts the documents of a result list by
                                 relevance, producing an ideal DCG at position p (IDCG_p), which normalizes the score:
                                 \[ \mathrm{nDCG}_p = \frac{\mathrm{DCG}_p}{\mathrm{IDCG}_p} \]
                                 The nDCG values for all queries can be averaged to obtain a measure of the average performance of
                                 a ranking algorithm. Note that in a perfect ranking algorithm, DCG_p will be the same as IDCG_p,
                                 producing an nDCG of 1.0. All nDCG calculations are then relative values on the interval 0.0 to 1.0
                                 and so are cross-query comparable.
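
                                 The normalisation can be sketched as follows. The ideal DCG is obtained by sorting the
                                 grades of the same result list in descending order, as described above. The function
                                 names and sample grades are hypothetical, and returning 0.0 when the ideal DCG is zero
                                 is an assumption made to avoid division by zero.

```python
import math


def dcg_at_p(relevances, p):
    """DCG at rank p: rel_1 plus rel_i / log2(i) for i = 2..p."""
    rels = relevances[:p]
    if not rels:
        return 0.0
    total = float(rels[0])
    for i, rel in enumerate(rels[1:], start=2):
        total += rel / math.log2(i)
    return total


def ndcg_at_p(relevances, p):
    """nDCG at rank p: DCG_p divided by the DCG_p of the ideally ordered list."""
    ideal = sorted(relevances, reverse=True)  # best possible ordering of this list
    idcg = dcg_at_p(ideal, p)
    if idcg == 0.0:
        return 0.0  # Assumption: no relevant documents at all, so report 0.0.
    return dcg_at_p(relevances, p) / idcg


grades = [3, 2, 3, 0, 1, 2]            # hypothetical system ranking
score = ndcg_at_p(grades, 6)
print(round(score, 3))                  # about 0.932

perfect = ndcg_at_p([3, 3, 2, 2, 1, 0], 6)
print(perfect)                          # 1.0 for an already ideal ordering
```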
                                 This model has been very productive and has promoted our understanding of information retrieval
                                 in many ways. However, as Kuhn noted, major models that are as central to a field as this one
                                 eventually begin to show inadequacies as testing leads to an ever greater understanding of the
                                 processes being studied. The limitations of the original model’s representation of the phenomenon
                                 of interest become more and more evident.
                                  It is only fitting, then, that in recent years the above classic model has come under attack in various
                                 ways. Oddy and Belkin et al. have asked why it is necessary for the searcher to find a way to represent
                                 the information need in a query understandable by the system. Why cannot the system make it
                                 possible for the searcher to express the need directly as they would ordinarily, instead of in an
                                 artificial query representation for the system’s consumption?










                                             LOVELY PROFESSIONAL UNIVERSITY