Page 121 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                                      • Binary Independence Model

                                      • Probabilistic relevance model, on which the Okapi (BM25) relevance function is based
                                      • Uncertain inference
                                      • Language models
                                      • Divergence-from-randomness model
                                      • Latent Dirichlet allocation
                                    • Feature-based retrieval models view documents as vectors of values of feature functions (or
                                      just features) and seek the best way to combine these features into a single relevance score,
                                      typically by learning-to-rank methods. Feature functions are arbitrary functions of document
                                      and query, and as such can easily incorporate almost any other retrieval model as just
                                      another feature.
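
The idea of combining feature values into one relevance score can be sketched as follows. This is a minimal illustration only: the two feature functions (`term_overlap`, `length_prior`) and the fixed weights are hypothetical choices made for the example, not taken from the text, and a real learning-to-rank method would learn the weights from training data rather than set them by hand.

```python
from typing import Callable, List, Sequence

# A feature function maps a (query, document) pair to a number.
Feature = Callable[[str, str], float]


def term_overlap(query: str, doc: str) -> float:
    """Fraction of distinct query terms that also occur in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def length_prior(query: str, doc: str) -> float:
    """Hypothetical feature: mildly prefers shorter documents."""
    return 1.0 / (1.0 + len(doc.split()))


def score(query: str, doc: str,
          features: Sequence[Feature], weights: Sequence[float]) -> float:
    """Combine the feature values into a single score (weighted sum)."""
    return sum(w * f(query, doc) for f, w in zip(features, weights))


def rank(query: str, docs: List[str],
         features: Sequence[Feature], weights: Sequence[float]) -> List[str]:
    """Return the documents ordered from highest to lowest score."""
    return sorted(docs, key=lambda d: score(query, d, features, weights),
                  reverse=True)


if __name__ == "__main__":
    docs = ["probabilistic retrieval model", "cooking recipes for dinner"]
    # Weights are assumed by hand here; learning to rank would fit them.
    print(rank("retrieval model", docs, [term_overlap, length_prior], [1.0, 0.1]))
```

Because a feature is just a function of query and document, the score of any other retrieval model could be added to the `features` list as one more entry.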
                                 Second dimension: properties of the model
                                    • Models without term interdependencies treat different terms/words as independent. This
                                      is usually represented in vector space models by the orthogonality assumption of term
                                      vectors, or in probabilistic models by an independence assumption for term variables.
                                    • Models with immanent term interdependencies allow a representation of interdependencies
                                      between terms. However, the degree of the interdependency between two terms is defined by
                                      the model itself. It is usually derived, directly or indirectly (e.g., by dimensionality
                                      reduction), from the co-occurrence of those terms in the whole set of documents.
                                    • Models with transcendent term interdependencies allow a representation of interdependencies
                                      between terms, but they do not specify how the interdependency between two terms is
                                      defined. They rely on an external source for the degree of interdependency between two terms
                                      (for example, a human or a sophisticated algorithm).
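
The three treatments of term interdependency can be contrasted in a small sketch. It assumes a toy setting that is not in the text: terms are represented by binary occurrence vectors over the documents, the immanent degree of interdependency is taken to be the cosine of those vectors, and the external source is a plain dictionary. All function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def term_vectors(docs: List[str]) -> Dict[str, List[int]]:
    """Represent each term by its occurrence pattern across the documents."""
    tokenized = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*tokenized))
    return {t: [1 if t in doc else 0 for doc in tokenized] for t in vocab}


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def independent_similarity(t1: str, t2: str) -> float:
    """No interdependency: distinct terms are orthogonal."""
    return 1.0 if t1 == t2 else 0.0


def immanent_similarity(t1: str, t2: str,
                        vectors: Dict[str, List[int]]) -> float:
    """Interdependency derived by the model from co-occurrence in the collection."""
    return cosine(vectors[t1], vectors[t2])


def transcendent_similarity(t1: str, t2: str,
                            external: Dict[Tuple[str, str], float]) -> float:
    """Interdependency looked up in an external source (here, a dictionary)."""
    return external.get((t1, t2), external.get((t2, t1), 0.0))


if __name__ == "__main__":
    docs = ["car engine", "car engine repair", "banana bread"]
    vectors = term_vectors(docs)
    print(independent_similarity("car", "engine"))
    print(immanent_similarity("car", "engine", vectors))
    print(transcendent_similarity("car", "engine", {("car", "engine"): 0.8}))
```

In the first case the relationship between "car" and "engine" is ignored, in the second it comes from the document collection itself, and in the third it is supplied from outside the model.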

                                 Self Assessment

                                 Multiple Choice Questions:
                                  1.   Automated information retrieval systems are used to reduce ......
                                        (a)  digital obsolescence     (b) information overload
                                        (c)  information need
                                  2.   ...... is the fraction of the documents retrieved that are relevant to the user’s information
                                       need.
                                        (a)  recall                   (b) precision           (c) fall-out
                                  3.   ...... is the fraction of documents that are relevant to the query that are successfully retrieved.
                                        (a)  recall                   (b) F-measure           (c) fall-out
                                  4.   The proportion of non-relevant documents that are retrieved, out of all non-relevant
                                       documents available, is known as ......
                                        (a)  recall                   (b) F-measure           (c) fall-out
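
For readers who want to check their answers numerically, the three set-based measures described in the questions above can be sketched as below. The sketch assumes documents are identified by integers and that the retrieved set, the relevant set, and the whole collection are given as Python sets; the sample data is made up for illustration.

```python
from typing import Set


def precision(retrieved: Set[int], relevant: Set[int]) -> float:
    """Fraction of the retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved: Set[int], relevant: Set[int]) -> float:
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def fall_out(retrieved: Set[int], relevant: Set[int],
             collection: Set[int]) -> float:
    """Fraction of all non-relevant documents that were retrieved."""
    non_relevant = collection - relevant
    return len(retrieved & non_relevant) / len(non_relevant) if non_relevant else 0.0


if __name__ == "__main__":
    collection = set(range(10))   # ten documents, ids 0..9 (made-up data)
    relevant = {0, 1, 2, 3}
    retrieved = {0, 1, 4}
    print(precision(retrieved, relevant))
    print(recall(retrieved, relevant))
    print(fall_out(retrieved, relevant, collection))
```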

                                 6.3 Search Strategies

                                 A search strategy is a comprehensive plan for finding information. It includes defining the information
                                 need and determining the form in which the information is needed, whether it exists, where it is located,
                                 how it is organized, and how to retrieve it.
                                 Advances in technology, and in particular the high volume of content accessible through the Internet,
                                 have led to an explosion of information available on a global scale.




            116                              LOVELY PROFESSIONAL UNIVERSITY