Page 105 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL

Information Storage and Retrieval




                                               In binary classification, recall is called sensitivity. It can therefore be viewed as the
                                               probability that a relevant document is retrieved by the query.
                                 It is trivial to achieve 100% recall by returning all documents in response to any query. Recall
                                 alone is therefore not enough: one also needs to account for the non-relevant documents that
                                 were retrieved, for example by computing the precision.
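
The point about trivially perfect recall can be illustrated with a minimal sketch. The collection of ten document ids, the choice of relevant ids, and the function names `recall` and `precision` are all hypothetical, chosen only for illustration; precision is taken here in its usual sense, the fraction of retrieved documents that are relevant.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when nothing is relevant")
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        raise ValueError("precision is undefined when nothing is retrieved")
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical collection: 10 documents, of which 2 are relevant.
collection = set(range(1, 11))
relevant = {3, 7}

# "Return everything" finds every relevant document, so recall is perfect,
# but most of what is returned is non-relevant, so precision is poor.
trivial_recall = recall(collection, relevant)
trivial_precision = precision(collection, relevant)

print(f"recall = {trivial_recall:.0%}, precision = {trivial_precision:.0%}")
```

Under these assumed numbers, returning the whole collection gives full recall while only two of the ten returned documents are relevant, which is why recall is reported together with precision.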

                                 Precision and recall are the basic measures used in evaluating search strategies.

                                 As shown in Figures 10.1 and 10.2, these measures assume:
                                  1.   There is a set of records in the database which is relevant to the search topic.
                                  2.   Records are assumed to be either relevant or irrelevant (these measures do not allow for
                                       degrees of relevancy).

                                 Figure 10.1 (diagram; labels: "The set of relevant items in the database", "The set of items retrieved")


                                 Figure 10.2 (diagram; labels: "Irrelevant items retrieved", "Relevant items retrieved", "Relevant items not retrieved")



                                  3.   The actual retrieval set may not perfectly match the set of relevant records.
                                 RECALL is the ratio of the number of relevant records retrieved to the total number of relevant
                                 records in the database. It is usually expressed as a percentage.
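
As a worked illustration of the definition above, the following sketch computes recall as a percentage from two counts. The function name `recall_percent` and the example figures (30 relevant records retrieved out of 50 relevant records in the database) are assumptions made for illustration, not values from the text.

```python
def recall_percent(relevant_retrieved, total_relevant):
    """Recall as a percentage: relevant records retrieved divided by
    all relevant records in the database, multiplied by 100."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    if not 0 <= relevant_retrieved <= total_relevant:
        raise ValueError("relevant_retrieved must lie between 0 and total_relevant")
    return 100.0 * relevant_retrieved / total_relevant


# Hypothetical search: the database holds 50 relevant records
# and the search retrieved 30 of them.
example = recall_percent(30, 50)
print(f"Recall = {example:.0f}%")
```

With these assumed counts the search has a recall of 60%, meaning that 40% of the relevant records in the database were missed.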






                                             LOVELY PROFESSIONAL UNIVERSITY