Page 120 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Unit 6: Information Retrieval Model and Search Strategies
Model types
Figure 6.2: Categorization of IR-models (translated from German entry, original source Dominik Kuropka).
[Figure: the models are arranged by mathematical basis (set-theoretic, algebraic, probabilistic) against the properties of the model (without term-interdependencies; with immanent or transcendent term-interdependencies).]
For information retrieval to be efficient, documents are typically transformed into a suitable
representation. Several representations exist. Figure 6.2 above illustrates the relationship between
some common models. In the figure, the models are categorized along two dimensions: the
mathematical basis and the properties of the model.
First dimension: mathematical basis
Set-theoretic models represent documents as sets of words or phrases. Similarities are usually
derived from set-theoretic operations on those sets. Common models are:
• Standard Boolean model
• Extended Boolean model
• Fuzzy retrieval
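The set-theoretic idea can be illustrated with a minimal sketch of Standard Boolean matching. It assumes a toy collection whose document ids and texts are invented for illustration, and it treats each document simply as the set of its words; the helper name `matching` is hypothetical.

```python
# Hypothetical toy collection; ids and texts are invented for illustration.
docs = {
    "d1": "information retrieval with boolean queries",
    "d2": "vector space retrieval model",
    "d3": "boolean algebra and set theory",
}

# Represent each document as a set of words.
doc_terms = {doc_id: set(text.split()) for doc_id, text in docs.items()}


def matching(term):
    """Return the set of document ids whose word set contains the term."""
    return {doc_id for doc_id, terms in doc_terms.items() if term in terms}


# Boolean operators map directly onto set operations.
and_result = matching("boolean") & matching("retrieval")  # AND     -> intersection
or_result = matching("boolean") | matching("retrieval")   # OR      -> union
not_result = matching("retrieval") - matching("boolean")  # AND NOT -> difference
```

Here a document either matches a query or it does not; there is no ranking, which is one reason the Extended Boolean and fuzzy variants introduce graded membership.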
Algebraic models usually represent documents and queries as vectors, matrices, or tuples.
The similarity of the query vector and the document vector is expressed as a scalar value. Common models are:
• Vector space model
• Generalized vector space model
• (Enhanced) Topic-based Vector Space Model
• Extended Boolean model
• Latent semantic indexing (also known as latent semantic analysis)
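As a rough sketch of how an algebraic model reduces a query and a document to a single scalar, the following computes cosine similarity over raw term counts. The vocabulary, the texts, and the function names `to_vector` and `cosine_similarity` are assumptions made for illustration; real systems typically use weighted terms (for example tf-idf) rather than raw counts.

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Map a text to a vector of raw term counts over a fixed vocabulary."""
    counts = Counter(text.split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector shares no terms with anything
    return dot / (norm_u * norm_v)


# Hypothetical vocabulary and texts, invented for illustration.
vocabulary = ["information", "retrieval", "model", "boolean"]
query_vec = to_vector("retrieval model", vocabulary)
doc_vec = to_vector("information retrieval model", vocabulary)
score = cosine_similarity(query_vec, doc_vec)  # scalar similarity value
```

Because the result is a scalar, documents can be ranked by score instead of being split into matching and non-matching sets.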
Probabilistic models treat the process of document retrieval as probabilistic inference.
Similarities are computed as probabilities that a document is relevant for a given query.
Probabilistic theorems such as Bayes' theorem are often used in these models.
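The role of Bayes' theorem can be sketched with a small calculation. This is a simplified illustration, not a specific published model: the function name `posterior_relevance` and all probability values are assumptions chosen only to show the arithmetic.

```python
def posterior_relevance(prior, likelihood_relevant, likelihood_nonrelevant):
    """Apply Bayes' theorem to estimate P(relevant | observed evidence).

    prior                   -- P(relevant) before looking at the document
    likelihood_relevant     -- P(evidence | relevant)
    likelihood_nonrelevant  -- P(evidence | not relevant)
    """
    # Total probability of the evidence over both hypotheses.
    evidence = (likelihood_relevant * prior
                + likelihood_nonrelevant * (1 - prior))
    if evidence == 0:
        return 0.0
    return likelihood_relevant * prior / evidence


# Hypothetical numbers: 10% of documents are relevant; the query term occurs
# in 80% of relevant documents and in 20% of non-relevant ones.
p = posterior_relevance(prior=0.1,
                        likelihood_relevant=0.8,
                        likelihood_nonrelevant=0.2)
```

With these assumed values, observing the query term raises the estimated probability of relevance from 0.1 to roughly 0.31, and documents could then be ranked by this posterior.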
LOVELY PROFESSIONAL UNIVERSITY 115