Page 107 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL
Other problems with Precision and Recall:
As noted earlier, records must be considered either relevant or irrelevant when calculating precision
and recall. Obviously records can exist which are marginally relevant or somewhat irrelevant. Others
may be very relevant and others completely irrelevant. This problem is complicated by individual
perception: what is relevant to one person may not be relevant to another. Measuring recall is difficult
because it is often difficult to know how many relevant records exist in a database. Often recall is
estimated by identifying a pool of relevant records and then determining what proportion of the
pool the search retrieved. There are several ways of creating a pool of relevant records: one method
is to use all the relevant records found from different searches, another is to manually scan several
journals to identify a set of relevant papers.
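The pooling estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name pooled_recall and the record IDs are hypothetical, and the pool is built with the first method mentioned (combining the relevant records found by different searches).

```python
def pooled_recall(retrieved, pool):
    """Estimate recall as the share of the relevant pool that the search retrieved."""
    pool = set(pool)
    if not pool:
        raise ValueError("the pool of relevant records is empty")
    return len(set(retrieved) & pool) / len(pool)


# Hypothetical record IDs judged relevant in three different searches.
search_a = {"r1", "r2", "r3"}
search_b = {"r2", "r4"}
search_c = {"r5", "r1"}

# Pool the relevant records found by all of the searches.
pool = search_a | search_b | search_c

print(pooled_recall(search_a, pool))  # 3 of the 5 pooled records -> 0.6
```

Because the pool holds only the relevant records that were actually found, the result is an estimate: relevant records that no search retrieved are missing from the pool, so true recall may be lower.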
Precision and Recall are useful measures despite their limitations:
As abstract ideas, recall and precision are invaluable to the experienced searcher. Knowing the goal
of the search — to find everything on a topic, just a few relevant papers, or something in-between —
determines what strategies the searcher will use. There are a variety of search techniques which
may be used to affect the level of recall and precision. A good searcher must be adept at using them.
Many of these techniques are discussed in the section on search strategies.
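As a rough illustration of how the goal of a search shapes the two measures, the sketch below compares a narrow search with a broad one. The record numbers, the set of relevant records, and the function name precision_recall are all made up for the example; relevance is treated as binary, as the measures require.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, using binary relevance."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant records retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical database in which records 1 to 4 are the relevant ones.
relevant = {1, 2, 3, 4}

narrow_search = {1, 2}                    # just a few records, all relevant
broad_search = {1, 2, 3, 4, 5, 6, 7, 8}   # everything relevant, plus irrelevant records

print(precision_recall(narrow_search, relevant))  # (1.0, 0.5)
print(precision_recall(broad_search, relevant))   # (0.5, 1.0)
```

In this toy case the narrow search suits a user who wants just a few relevant papers, while the broad search suits a user who wants everything on the topic and accepts irrelevant records in the results.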
Self Assessment
Fill in the blanks:
3. In binary classification, recall is called ...... .
4. ...... is the ratio of the number of relevant records retrieved to the total number of relevant
records in the database.
5. ...... is the ratio of the number of relevant records retrieved to the total number of irrelevant
and relevant records retrieved.
6. Records must be considered either relevant or irrelevant when calculating ...... and ...... .
7. As abstract ideas, recall and precision are invaluable to the ...... .
10.5 Relevance
In information science and information retrieval, relevance denotes how well a retrieved document
or set of documents meets the information need of the user.
Types
Relevance most commonly refers to topical relevance or aboutness, i.e., to what extent the topic of a
result matches the topic of the query or information need. Relevance can also be interpreted more
broadly, referring to generally how “good” a retrieved result is with regard to the information need.
The latter definition of relevance, sometimes referred to as user relevance, encompasses topical
relevance and possibly other concerns of the user such as timeliness, authority or novelty of the
result.
History
The concern with the problem of finding relevant information dates back at least to the first publication
of scientific journals in the 17th century. The formal study of relevance began in the 20th century with