Page 74 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL
Unit 7: Sorting and Indexing
FRBR defines the tasks that the user of a catalogue should be able to accomplish:
Find: locate information that matches the user’s search criteria.
Identify: recognize the information the user wants and eliminate information or entities the user
does not want.
Select: choose a particular entity appropriate to the user’s needs.
Obtain: get access to it through loan or remotely.
FRBR comprises three groups of entities that represent the key objects of interest to the users of
bibliographic information:
Group 1 represents intellectual or artistic products: Work, Expression, Manifestation, and Item.
Group 2 represents the entities responsible for the intellectual or artistic content (responsibility
relationships): Person and Corporate body.
Group 3 entities are subjects of Group 1 or Group 2’s intellectual endeavor, and include Concepts,
Objects, Events, and Places.
Self Assessment
Fill in the blanks:
1. The online catalogue does not need to be stored ...... .
2. The growth of ...... and computerization add to the need for that quality.
3. The catalogue stands at the core of all ...... services.
4. At the moment we catalogue a variety of electronic resources, among them are ...... etc.
5. Recently the committee decided that the new cataloguing code will be called ...... or RDA.
7.4 Concept Indexing
Concept Indexing with WordNet Synsets
The popularity of the bag of words model is justified by the fact that words and their stems carry an
important part of the meaning of a text, especially for subject-based classification. However,
this representation faces two main problems: synonymy (different words with the same meaning)
and polysemy (one word with several meanings). A concept indexing model based on WordNet
synsets addresses both problems.
The basic idea of concept indexing with WordNet synsets is to recognize the synsets to which the
words in a text refer, and to use them as terms for representing documents in a Vector Space Model.
Synset weights in documents can be computed with the same formulas used for word stem terms in the
bag of words representation. This concept-based representation can improve IR, as commented by
Gonzalo et al.: “(...) using WordNet synsets as indexing space instead of word forms (...) combines
two benefits for retrieval: one, that terms are fully disambiguated (this should improve precision);
and two, that equivalent terms can be identified (this should improve recall).”
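The two benefits can be illustrated with a minimal sketch. It assumes the documents are already sense-tagged (as SemCor texts are) and uses a small hypothetical sense inventory in place of WordNet; the synset labels, the toy documents and the plain tf-idf weighting are illustrative assumptions, not the setup used by Gonzalo et al.

```python
import math
from collections import Counter

# Hypothetical sense inventory: (word, sense number) -> synset label.
# The labels are made up for illustration; they are not WordNet identifiers.
SENSE_TO_SYNSET = {
    ("car", 1): "SYN_MOTORCAR",
    ("automobile", 1): "SYN_MOTORCAR",
    ("bank", 1): "SYN_FINANCIAL_INSTITUTION",
    ("bank", 2): "SYN_RIVER_SIDE",
    ("river", 1): "SYN_RIVER",
}


def to_words(tagged_doc):
    """Bag of words view: keep the word form, drop the sense number."""
    return [word for word, _sense in tagged_doc]


def to_synsets(tagged_doc):
    """Concept view: replace each sense-tagged token with its synset label."""
    return [SENSE_TO_SYNSET[token] for token in tagged_doc]


def tfidf_vectors(docs):
    """Weight terms by tf * log(N / df); the same formula serves words and synsets."""
    n_docs = len(docs)
    doc_freq = Counter()
    for terms in docs:
        doc_freq.update(set(terms))
    vectors = []
    for terms in docs:
        term_freq = Counter(terms)
        vectors.append(
            {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in term_freq.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy sense-tagged documents (hypothetical data).
TAGGED_DOCS = [
    [("car", 1), ("bank", 1)],
    [("automobile", 1), ("bank", 2), ("river", 1)],
    [("river", 1), ("bank", 2)],
]

word_vecs = tfidf_vectors([to_words(d) for d in TAGGED_DOCS])
synset_vecs = tfidf_vectors([to_synsets(d) for d in TAGGED_DOCS])

print("word space, doc 1 vs doc 2:  ", cosine(word_vecs[0], word_vecs[1]))
print("synset space, doc 1 vs doc 2:", cosine(synset_vecs[0], synset_vecs[1]))
```

In the word space the first two documents share only the form "bank", which occurs in every document and so receives zero weight; their similarity is zero. In the synset space "car" and "automobile" become the same term, so the similarity is positive (the recall benefit), while the two senses of "bank" remain separate terms (the precision benefit).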
It is important to note that the information available in SemCor allows both sense indexing and
concept indexing. By sense indexing, we mean using word senses as indexing units. For instance, we
could use the pair (car, sense 1), or “car s1”, as an indexing unit. Concept indexing involves a
word-independent normalization that recognizes “car s1” and “automobile s1” as occurrences of the
same concept, the noun with code 02573998 in WordNet (thus addressing synonymy and polysemy
simultaneously).
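The difference between the two kinds of indexing unit can be sketched as follows. The code 02573998 is the one quoted above; the entry for a second sense of "car" and the function names `sense_unit` and `concept_unit` are hypothetical, introduced only for this illustration.

```python
# Sense inventory: (word, sense number) -> synset code.
# "02573998" is the code quoted in the text; "hypothetical-0001" is a
# made-up placeholder for a different sense of "car".
SYNSET_CODE = {
    ("car", 1): "02573998",
    ("automobile", 1): "02573998",
    ("car", 2): "hypothetical-0001",
}


def sense_unit(word, sense):
    """Sense indexing: the unit still depends on the word form."""
    return f"{word} s{sense}"


def concept_unit(word, sense):
    """Concept indexing: the unit is the synset code, independent of the word."""
    return SYNSET_CODE[(word, sense)]


for word, sense in [("car", 1), ("automobile", 1), ("car", 2)]:
    print(sense_unit(word, sense), "->", concept_unit(word, sense))
```

Sense indexing keeps "car s1" and "automobile s1" as two distinct units, so it handles polysemy but not synonymy. Concept indexing maps both to the same code while still separating the other sense of "car", so it handles both.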
Explain the term Concept Indexing.