Page 74 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL

Unit 7: Sorting and Indexing




          FRBR defines the tasks that the user of a catalogue should be able to accomplish:
          Find: locate information that corresponds to the user’s search criteria.
          Identify: confirm the information the user wants and eliminate information or entities the user does not want.
          Select: choose a particular entity appropriate to the user’s needs.
          Obtain: acquire it through loan or remote access.
          FRBR comprises 3 groups of entities that represent the key objects of interest to the users of
          bibliographic information:
          Group 1 represents intellectual or artistic products: Work, Expression, Manifestation, and Item.
          Group 2 entities are those responsible for the intellectual or artistic content (responsibility
          relationships): Person and Corporate body.
          Group 3 entities are subjects of Group 1 or Group 2’s intellectual endeavor, and include Concepts,
          Objects, Events, and Places.
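
          The three entity groups can be pictured as a small data structure. The sketch below is only an
          illustration: the class and field names are hypothetical, the nesting of Work, Expression,
          Manifestation and Item follows the general FRBR model rather than anything stated on this page,
          and the sample record is invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Group 1: a single copy (hypothetical identifier field)."""
    identifier: str


@dataclass
class Manifestation:
    """Group 1: a published embodiment, holding its copies."""
    description: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """Group 1: a particular realization, e.g. a text or a translation."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """Group 1 entity linked to Group 2 (creators) and Group 3 (subjects)."""
    title: str
    creators: List[str] = field(default_factory=list)   # Group 2: Person, Corporate body
    subjects: List[str] = field(default_factory=list)   # Group 3: Concept, Object, Event, Place
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every Item reachable from a Work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Invented sample record, for illustration only.
hamlet = Work(
    title="Hamlet",
    creators=["William Shakespeare"],
    subjects=["Revenge", "Denmark"],
    expressions=[
        Expression(
            description="English text",
            manifestations=[
                Manifestation(
                    description="Paperback edition",
                    items=[Item("copy-1"), Item("copy-2")],
                )
            ],
        )
    ],
)
```

          Calling count_items(hamlet) walks from the Work down to its Items, which mirrors how a user moves
          from finding a work to obtaining a specific copy.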


          Self Assessment

          Fill in the blanks:
           1.   The online catalogue does not need to be stored ...... .
           2.   The growth of ...... and computerization add to the need for that quality.
           3.   The catalogue stands at the core of all ...... services.
           4.   At the moment we catalogue a variety of electronic resources, among them are ...... etc.
           5.   Recently the committee decided that the new cataloguing code will be called ...... or RDA.


          7.4 Concept Indexing

          Concept Indexing with WordNet Synsets

          The popularity of the bag of words model is justified by the fact that words and their stems carry an
          important part of the meaning of a text, especially in subject-based classification. However,
          this representation faces two main problems: the synonymy of words (different words with the same
          meaning) and the polysemy of words (one word with several meanings). These problems are addressed
          by a concept indexing model that uses WordNet synsets.
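
          The two problems can be seen in a toy bag of words comparison. This is a minimal sketch, not a
          full retrieval system: the function names and the sample sentences are invented for illustration,
          and tokenization is a plain whitespace split.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count word forms."""
    return Counter(text.lower().split())


def shared_terms(a, b):
    """Return the word forms that appear in both bags."""
    return set(a) & set(b)


# Synonymy: the query and the document mean the same thing
# but share no word form, so the document is missed.
query_auto = bag_of_words("automobile")
doc_car = bag_of_words("my car broke down")
synonymy_overlap = shared_terms(query_auto, doc_car)

# Polysemy: the query matches both documents on the word form "bank",
# although the two documents use it with different meanings.
query_bank = bag_of_words("bank")
doc_river = bag_of_words("we walked along the river bank")
doc_money = bag_of_words("the bank approved the loan")
polysemy_overlap_river = shared_terms(query_bank, doc_river)
polysemy_overlap_money = shared_terms(query_bank, doc_money)
```

          Here synonymy_overlap is empty, which hurts recall, while both polysemy overlaps contain "bank",
          which hurts precision.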
          The basic idea of concept indexing with WordNet synsets is to recognize the synsets to which words
          in texts refer, and to use them as terms for representing documents in a Vector Space Model.
          Synset weights in documents can be computed with the same formulas used for word stem terms in the
          bag of words representation. This concept-based representation can improve IR, as noted by
          Gonzalo et al.: “(...) using WordNet synsets as indexing space instead of word forms (...) combines
          two benefits for retrieval: one, that terms are fully disambiguated (this should improve precision);
          and two, that equivalent terms can be identified (this should improve recall).”
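
          The following sketch shows the same weighting formula applied to word forms and to synset terms.
          It rests on several assumptions: TOY_SYNSETS is a hand-made lookup, not WordNet itself; its
          synset labels are invented; each word is given a single sense, so no real disambiguation takes
          place; and raw term frequency with cosine similarity stands in for whatever weighting scheme
          a real system would use.

```python
import math
from collections import Counter

# Hypothetical lexicon mapping word forms to invented synset labels.
TOY_SYNSETS = {
    "car": "synset:motor_vehicle",
    "automobile": "synset:motor_vehicle",
    "engine": "synset:engine",
    "motor": "synset:engine",
}


def to_terms(text, use_synsets):
    """Turn a text into index terms: word forms, or synset labels when known."""
    tokens = text.lower().split()
    if not use_synsets:
        return tokens
    return [TOY_SYNSETS.get(token, token) for token in tokens]


def tf_vector(terms):
    """Weight each term by its raw frequency (same formula for both term types)."""
    return Counter(terms)


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(text_a, text_b, use_synsets):
    """Compare two texts in the chosen indexing space."""
    vec_a = tf_vector(to_terms(text_a, use_synsets))
    vec_b = tf_vector(to_terms(text_b, use_synsets))
    return cosine(vec_a, vec_b)


word_score = similarity("car engine", "automobile motor", use_synsets=False)
concept_score = similarity("car engine", "automobile motor", use_synsets=True)
```

          With word forms the two texts share no term, so word_score is zero. With synset terms they map
          to the same two concepts, so concept_score is close to one.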
          It is important to note that the information available in SemCor allows both sense indexing and
          concept indexing. By sense indexing, we mean using word senses as indexing units. For instance, we
          could use the pair (car, sense 1), or “car s1”, as an indexing unit. Concept indexing involves a
          word-independent normalization that allows “car s1” and “automobile s1” to be recognized as
          occurrences of the same concept, the noun with code 02573998 in WordNet (thus addressing synonymy
          and polysemy simultaneously).
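
          The difference between the two indexing units can be sketched as follows. The input is assumed
          to be already sense-tagged, as in SemCor. The lookup table is hand-made for illustration: the
          code 02573998 comes from the text above, while the entry for (car, sense 2) is a placeholder
          and not a real WordNet code.

```python
# Hypothetical table from (word, sense number) to a concept code.
SENSE_TO_CONCEPT = {
    ("car", 1): "02573998",
    ("automobile", 1): "02573998",
    ("car", 2): "placeholder-car-sense-2",  # placeholder, not a real WordNet code
}


def sense_unit(word, sense):
    """Sense indexing: the unit still depends on the word form."""
    return f"{word} s{sense}"


def concept_unit(word, sense):
    """Concept indexing: the unit is the word-independent concept code."""
    return SENSE_TO_CONCEPT[(word, sense)]


# Sense-tagged tokens, as they might appear after annotation.
tagged = [("car", 1), ("automobile", 1), ("car", 2)]

sense_index = [sense_unit(word, sense) for word, sense in tagged]
concept_index = [concept_unit(word, sense) for word, sense in tagged]
```

          In sense_index, “car s1” and “automobile s1” remain distinct units. In concept_index they collapse
          to the same code, while the second sense of “car” stays separate.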




                   Explain the term Concept Indexing.






                                            LOVELY PROFESSIONAL UNIVERSITY