Page 203 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                                 The first step involves an analysis of the request (submitted by the user) to determine what the user
                                 is really looking for, and the second step involves translation of that conceptual analysis into the
                                 vocabulary of the system. Thus there is a close resemblance between the indexing and search processes.
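The second step, translating the analysed request into the vocabulary of the system, can be illustrated with a minimal sketch. It assumes that the first step (the conceptual analysis) has already produced a list of concepts. The entry vocabulary, its sample terms and the function name translate_request are hypothetical and serve only as an illustration.

```python
# Hypothetical entry vocabulary: words a user might supply (lowercase),
# mapped to the system's preferred terms. All entries are illustrative.
ENTRY_VOCABULARY = {
    "heart attack": "Myocardial infarction",
    "myocardial infarction": "Myocardial infarction",
    "teenagers": "Adolescents",
    "adolescents": "Adolescents",
}


def translate_request(concepts, entry_vocabulary=ENTRY_VOCABULARY):
    """Translate analysed concepts into the vocabulary of the system.

    Returns (search_terms, unmatched): the preferred terms in the order
    first seen, and the concepts that have no entry in the vocabulary.
    """
    search_terms, unmatched = [], []
    for concept in concepts:
        preferred = entry_vocabulary.get(concept.strip().lower())
        if preferred is None:
            unmatched.append(concept)      # needs the searcher's attention
        elif preferred not in search_terms:
            search_terms.append(preferred)
    return search_terms, unmatched


terms, unmatched = translate_request(["Heart attack", "teenagers", "diet"])
print(terms)      # preferred terms usable in the search expression
print(unmatched)  # concepts with no equivalent in the vocabulary
```

An indexer performs the same translation when assigning terms to a document, which is why the two processes resemble each other.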
                                 There are two major objectives of vocabulary control in an information retrieval environment:
                                       (a) to promote the consistent representation of subject matter by indexers and searchers,
                                          thereby avoiding the dispersion of related materials. This is achieved through the control
                                          (merging) of synonymous and near-synonymous expressions and by distinguishing among
                                          homographs;
                                       (b) to facilitate the conduct of a comprehensive search on some topic by linking together
                                          terms whose meanings are related.
                                 Lancaster [1986] further adds that indexing tends to be more consistent when the vocabulary used
                                 is controlled, because indexers are more likely to agree on the terms needed to describe a
                                 particular topic if they are selected from a pre-established list than when given a free hand to
                                 use any terms they wish.
                                 Similarly, from the searcher’s point of view, it is easier to identify the terms appropriate to information
                                 needs if these terms must be selected from a definitive list. Thus, a controlled vocabulary tends to
                                 bring the language of indexers and searchers into agreement. A large body of literature covers
                                 the details of the various vocabulary control tools [for example, Aitchison and Gilchrist, 2000]. There
                                 are also standards such as the British Standards (BS 5723 and BS 6723), International Standards
                                 (such as ISO 2788 and ISO 5964), and UNISIST guidelines.
                                 A number of vocabulary control tools have been designed over the years: they differ in their structure
                                 and design features, but they all have the same purpose in an information retrieval environment.
                                 The availability of vocabulary control helps both indexers, i.e., the people who create
                                 document records, particularly those who create subject representations for documents (by using
                                 keywords in a post-coordinate system, for example), and end-users in the formulation of
                                 their search expressions.
                                 Now it should be clear that a natural language system suffers from a variety of problems in the
                                 context of developing an index file. Thus, the need for control of the vocabulary arises. A
                                 controlled vocabulary is an authority list of terms that shows their inter-relationships and
                                 indicates the ways in which they may be combined to represent the specific subject of a document.
                                 A certain degree of semantic structure is introduced in the controlled vocabulary so that terms
                                 whose meanings are related may be brought together or linked in some ways. This semantic structure
                                 is incorporated by means of (a) controlling synonyms, word forms, etc. and distinguishing
                                 homographs for consistent representation of the subject of the documents; and (b) providing
                                 a mechanism to link terms that are semantically related, whether hierarchically or
                                 non-hierarchically, to facilitate comprehensive search.
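The two mechanisms described above can be sketched as a small data structure. This is a minimal illustration and does not follow any particular standard or tool. The class names, method names and sample terms are all hypothetical. Synonyms are merged through an entry table, homographs are kept apart by parenthetical qualifiers, and hierarchical and non-hierarchical links are stored on each preferred term.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term with its semantic links (all names are hypothetical)."""
    label: str
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


class ControlledVocabulary:
    def __init__(self):
        self.terms = {}  # preferred label -> Term
        self.use = {}    # lowercase entry word -> preferred label

    def add_term(self, label):
        self.terms.setdefault(label, Term(label))
        self.use[label.lower()] = label

    def add_synonym(self, variant, preferred):
        # (a) synonym control: the variant leads to one preferred term
        self.use[variant.lower()] = preferred

    def add_hierarchy(self, broader, narrower):
        # (b) hierarchical link, recorded in both directions
        self.terms[broader].narrower.add(narrower)
        self.terms[narrower].broader.add(broader)

    def add_related(self, first, second):
        # (b) non-hierarchical link, recorded in both directions
        self.terms[first].related.add(second)
        self.terms[second].related.add(first)

    def resolve(self, word):
        """Return the preferred term for an entry word, or None."""
        return self.use.get(word.strip().lower())

    def expand(self, word):
        """Preferred term plus its narrower and related terms (one level)."""
        preferred = self.resolve(word)
        if preferred is None:
            return set()
        term = self.terms[preferred]
        return {preferred} | term.narrower | term.related


def build_sample_vocabulary():
    vocab = ControlledVocabulary()
    # Homographs are distinguished by qualifiers in parentheses.
    for label in ("Vehicles", "Automobiles", "Electric automobiles",
                  "Roads", "Mercury (planet)", "Mercury (metal)"):
        vocab.add_term(label)
    vocab.add_synonym("cars", "Automobiles")
    vocab.add_synonym("motor cars", "Automobiles")
    vocab.add_hierarchy("Vehicles", "Automobiles")
    vocab.add_hierarchy("Automobiles", "Electric automobiles")
    vocab.add_related("Automobiles", "Roads")
    return vocab


vocab = build_sample_vocabulary()
print(vocab.resolve("Cars"))
print(sorted(vocab.expand("motor cars")))
```

In this sketch an unqualified entry word such as "mercury" resolves to nothing, so the searcher is forced to choose between the two qualified homographs. The expansion follows links one level only. A fuller tool would let the searcher decide how far to follow them.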





                                         Different techniques of vocabulary control have been adopted in tools such as the List of
                                         Subject Headings (LSH), Thesaurus, Thesaurofacet, etc.


                                 11.4 Vocabulary Tools
                                 There are two main kinds of controlled vocabulary tools used in libraries: subject headings and thesauri.
                                 Although the differences between the two are diminishing, some remain.
                                 Historically, subject headings were designed for cataloguers to describe books in library catalogues,
                                 while thesauri were used by indexers to apply index terms to documents and articles. Subject
                                 headings tend to be broader in scope, describing whole books, while thesauri tend to be more
                                 specialized, covering very specific disciplines. Also, because of the card catalogue system, subject
                                 headings tend to have terms that are in indirect order (though with the rise of automated systems
                                 this practice is being phased out), while thesaurus terms are always in direct order.
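The difference between indirect and direct order can be shown with a naive sketch. It assumes that an indirect heading consists of a noun followed by a comma and a single qualifier, as in "Chemistry, Organic". The function name to_direct_order is hypothetical, and the rule would mishandle headings that contain proper nouns or more than one comma.

```python
def to_direct_order(heading):
    """Turn a simple inverted heading into direct (natural) word order.

    Naive illustration only: it handles the single pattern
    'Noun, Qualifier' and lowercases the noun when it moves to the end.
    """
    head, separator, qualifier = heading.partition(", ")
    if not separator:
        return heading  # no comma inversion, so leave the heading unchanged
    return f"{qualifier.strip()} {head.strip().lower()}"


print(to_direct_order("Chemistry, Organic"))
print(to_direct_order("Libraries, Academic"))
```

A heading that is already in direct order passes through unchanged, which is the form thesaurus terms are expected to take.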



            198                              LOVELY PROFESSIONAL UNIVERSITY