Page 120 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL

Unit 12: Vocabulary Control




          describe what a given document is actually about, even if the terms themselves do not occur within
          the document’s text. Well-known subject heading systems include the Library of Congress system,
          MeSH, and Sears. Well-known thesauri include the Art and Architecture Thesaurus and the ERIC
          Thesaurus.
          Choosing the authorized terms is a tricky business. Besides the areas already considered
          above, the designer has to consider the specificity of the terms chosen, whether to use direct entry,
          and the consistency and stability of the language. Lastly, the balance between pre-coordination (in
          which case the degree of enumeration versus synthesis becomes an issue) and post-coordination in the
          system is another important issue.
          Controlled vocabulary elements (terms or phrases) employed as tags to aid in identifying the content
          of documents or other information system entities (e.g. DBMS, Web Services) qualify as
          metadata.
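The idea of tagging documents with authorized terms can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular system: the authority file, its entries, and the function name `assign_tags` are all hypothetical, and a real vocabulary would hold far more entry terms and relationships.

```python
# Hypothetical authority file: each entry term (left) points to exactly one
# authorized term (right). The entries are illustrative only and are not
# copied from any published subject heading list or thesaurus.
AUTHORITY_FILE = {
    "automobiles": "Automobiles",
    "cars": "Automobiles",
    "motor cars": "Automobiles",
    "aeroplanes": "Airplanes",
    "airplanes": "Airplanes",
}


def assign_tags(candidate_terms):
    """Map free-text candidate terms to authorized terms.

    Returns a pair: the sorted list of authorized tags (metadata for the
    document) and the list of candidate terms that had no match.
    """
    tags = set()
    unmatched = []
    for term in candidate_terms:
        # Normalise case and internal whitespace before the lookup.
        key = " ".join(term.lower().split())
        if key in AUTHORITY_FILE:
            tags.add(AUTHORITY_FILE[key])
        else:
            unmatched.append(term)
    return sorted(tags), unmatched


tags, unmatched = assign_tags(["Cars", "motor  cars", "lorries"])
print(tags)       # authorized terms stored with the document as metadata
print(unmatched)  # terms an indexer would need to review
```

Note that the tag "Automobiles" is assigned even though that word never appears among the candidate terms, which is the point made above: authorized terms describe what the document is about, whether or not they occur in its text.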

          12.2 Library Science

          This functional analysis provides a path away from the arguments that used to characterize information
          retrieval in the post-World War II period. Any relatively complete functional analysis of information
          storage and retrieval should provide a basis for the mapping of research and development activities.
          We suggest that the following components have been, and remain, of most interest in Library
          Science:

          The use of human intermediaries in query development.
          An emphasis on incorporating external knowledge (expert descriptive cataloguing, classification,
          and assignment of subject headings) into the representation (catalogue record).
          Vocabulary control (also called authority control) in creating representations, in syndetic
          structure, and in query development.
          In online catalogues, minimally a two-stage approach (a Boolean operation to partition the
          Representations, followed by the alphabetization of the Retrieved Set) and, commonly, a three-stage
          approach (the two-stage approach preceded by a search of the Searchable Index only, for feedback).
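The two-stage and three-stage approaches in the last item can be sketched as follows. This is a simplified illustration under stated assumptions: the catalogue records, the headings, and the function names (`build_index`, `index_feedback`, `boolean_and`, `alphabetize`) are hypothetical and do not come from any actual online catalogue.

```python
# Hypothetical catalogue: record identifier -> (title, set of subject headings).
CATALOGUE = {
    1: ("Thesaurus construction", {"Indexing", "Thesauri"}),
    2: ("Automatic indexing methods", {"Indexing", "Automation"}),
    3: ("Cataloguing rules", {"Cataloguing"}),
    4: ("Building a thesaurus", {"Thesauri"}),
}


def build_index(catalogue):
    """Build the searchable index: heading -> set of record identifiers."""
    index = {}
    for rec_id, (_title, headings) in catalogue.items():
        for heading in headings:
            index.setdefault(heading, set()).add(rec_id)
    return index


def index_feedback(index, heading):
    """Optional preliminary stage: consult the index only and report how
    many records carry the heading, so the searcher can refine the query."""
    return len(index.get(heading, set()))


def boolean_and(index, headings):
    """Partitioning stage: a Boolean AND over the requested headings."""
    postings = [index.get(h, set()) for h in headings]
    return set.intersection(*postings) if postings else set()


def alphabetize(catalogue, rec_ids):
    """Ordering stage: arrange the retrieved set alphabetically by title."""
    return sorted((catalogue[r][0] for r in rec_ids), key=str.lower)


index = build_index(CATALOGUE)
print(index_feedback(index, "Indexing"))                 # feedback from the index
retrieved = boolean_and(index, ["Indexing", "Thesauri"])  # partition
print(alphabetize(CATALOGUE, retrieved))                  # alphabetized set
```

The Boolean step only splits the records into retrieved and not retrieved; it says nothing about which retrieved record is best, which is why the display order falls back on alphabetization.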
          The activities generally referred to as information retrieval research have historically tended to
          emphasize:
          In storage, the use of algorithmic alternatives to human expertise in creating representations and
          indexes. Good examples are automatic keyword indexes (e.g. KWIC) and the generation of vector
          space representations of documents’ terms.
          In retrieval, the use of highly elaborate partitioning (retrieval) and transforming
          algorithms leading to the strict ranking of a set of retrieved documents.
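Both emphases can be illustrated with a short Python sketch. It assumes a deliberately simple setup that is not taken from the text: a tiny stop list, a KWIC-style index built from circular shifts of each title, raw term-frequency vectors, and cosine similarity as the ranking function. All names (`STOPWORDS`, `kwic`, `cosine`, `rank`) are hypothetical.

```python
import math
from collections import Counter

# Assumed stop list; real systems use much longer ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def kwic(titles):
    """Simplified KWIC-style index: one entry per significant word.

    Each title is rotated so the keyword comes first; "/" marks where the
    original title began. Entries are sorted by keyword.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() not in STOPWORDS:
                rotated = " ".join(words[i:] + ["/"] + words[:i])
                entries.append((word.lower(), rotated))
    return sorted(entries)


def cosine(u, v):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, documents):
    """Score every document against the query and sort by descending score."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "vocabulary control in indexing",
    "ranking retrieved documents",
    "control of vocabulary",
]
for keyword, line in kwic(["Control of Vocabulary"]):
    print(f"{keyword:12} {line}")
for score, doc in rank("vocabulary control", docs):
    print(f"{score:.3f}  {doc}")
```

Neither function involves any human judgement about what a document means: the index and the ranking are derived entirely from the words present, which is the contrast with the cataloguing-based components listed earlier.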

          Others might prefer to nominate other techniques as characteristic of these streams of research
          and development, but any realistic mapping onto a general framework of information storage and
          retrieval theory is likely to reveal how complementary, rather than contradictory, these interests are.
          There is a difference in emphasis: the former tends to emphasize quality of data, consistency, and
          expert human intervention; the latter explores efficient algorithmic approaches to large volumes
          of data. Neither alone can provide a complete approach to selection systems in theory or
          in practice.





                                            LOVELY PROFESSIONAL UNIVERSITY                                  115