Information Analysis and Repackaging
The first step involves an analysis of the request (submitted by the user) to determine what the user
is really looking for, and the second step involves translating this conceptual analysis into the
vocabulary of the system. Thus there is a close resemblance between the indexing and search processes.
There are two major objectives of vocabulary control in an information retrieval environment:
(a) to promote the consistent representation of subject matter by indexers and searchers,
thereby avoiding the dispersion of related materials. This is achieved through the control
(merging) of synonymous and near-synonymous expressions and by distinguishing among
homographs;
(b) to facilitate the conduct of a comprehensive search on some topic by linking together
terms whose meanings are related.
Lancaster [1986] further adds that indexing tends to be more consistent when the vocabulary
used is controlled, because indexers are more likely to agree on the terms needed to describe
a particular topic if they are selected from a pre-established list than when given a free
hand to use any terms they wish.
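The two objectives listed above can be illustrated with a minimal sketch. The authority list, the sample terms, and the function name `normalize` are all hypothetical and chosen only for illustration; a real vocabulary tool would be far larger and follow one of the published standards.

```python
# Hypothetical authority list: each entry term points to one preferred term.
# Merging synonyms and word forms this way keeps related material together.
USE = {
    "car": "automobiles",
    "cars": "automobiles",
    "motor car": "automobiles",
    "automobile": "automobiles",
    "automobiles": "automobiles",
}

# Homographs are distinguished by a parenthetical qualifier (illustrative data).
HOMOGRAPHS = {
    "mercury": ["mercury (metal)", "mercury (planet)"],
}


def normalize(term):
    """Return the candidate preferred terms for a raw indexer or searcher term.

    One result means the term is controlled; several results mean the term is
    a homograph and the user must choose; an empty list means the term is not
    in the vocabulary.
    """
    key = " ".join(term.lower().split())  # fold case and extra spaces
    if key in HOMOGRAPHS:
        return list(HOMOGRAPHS[key])
    if key in USE:
        return [USE[key]]
    return []


print(normalize("Motor  Car"))
print(normalize("Mercury"))
```

Because the indexer and the searcher both pass their terms through the same list, "cars" in a query and "motor car" in a document record end up under the same heading.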
Similarly, from the searcher’s point of view, it is easier to identify the terms appropriate to information
needs if these terms must be selected from a definitive list. Thus, a controlled vocabulary tends to
bring the language of indexers and searchers into agreement. A large body of literature covers
the details of various vocabulary control tools [for example, Aitchison and Gilchrist, 2000]. There
are also standards such as the British Standards (BS 5723 and BS 6723), International Standards
(such as ISO 2788 and ISO 5964), and UNISIST guidelines.
A number of vocabulary control tools have been designed over the years: they differ in their structure
and design features, but they all have the same purpose in an information retrieval environment.
The availability of vocabulary control helps both the indexers, i.e., the people who are engaged in creating
document records, particularly those who create subject representations for the documents (by using
keywords, in a post-coordinate system, for example), and the end-users in the formulation of
their search expressions.
Now it should be clear that a natural language system suffers from a variety of problems in the
context of the development of an index file. Thus, the need for control of the vocabulary arises. A
controlled vocabulary refers to an authority list of terms showing their inter-relationships and
indicating the ways in which they may be combined to represent the specific subject of a document.
A certain degree of semantic structure is introduced in the controlled vocabulary so that terms
whose meanings are related may be brought together or linked in some way. This semantic structure
is incorporated by means of (a) controlling the synonyms, word forms, etc. and distinguishing
homographs for consistent representation of the subject of the documents; and (b) providing a
mechanism to link hierarchically and non-hierarchically related terms to
facilitate comprehensive search.
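As a rough sketch of point (b), the links between hierarchical and non-hierarchical terms can be stored as broader, narrower, and related sets, and a search can then be widened by following them. The class `Term`, the helper functions, and the sample terms are assumptions made for this example and are not taken from any particular tool or standard.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    broader: set = field(default_factory=set)   # hierarchical link upwards
    narrower: set = field(default_factory=set)  # hierarchical link downwards
    related: set = field(default_factory=set)   # non-hierarchical link


THESAURUS = {}  # term name -> Term (hypothetical in-memory store)


def _entry(name):
    return THESAURUS.setdefault(name, Term())


def add_hierarchy(broader, narrower):
    """Record a reciprocal broader/narrower link between two terms."""
    _entry(broader).narrower.add(narrower)
    _entry(narrower).broader.add(broader)


def add_related(a, b):
    """Record a reciprocal non-hierarchical (related term) link."""
    _entry(a).related.add(b)
    _entry(b).related.add(a)


def expand(term, include_related=True):
    """Return the term, all of its narrower terms at any depth, and
    optionally the terms directly related to it, sorted alphabetically."""
    result = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in result:
            continue
        result.add(current)
        stack.extend(THESAURUS.get(current, Term()).narrower)
    if include_related:
        result |= THESAURUS.get(term, Term()).related
    return sorted(result)


add_hierarchy("vehicles", "automobiles")
add_hierarchy("automobiles", "electric cars")
add_related("automobiles", "roads")

print(expand("automobiles"))
```

A searcher who asks for "automobiles" would in this sketch also retrieve documents indexed under "electric cars" and be pointed to "roads", which is the kind of comprehensive search the linking is meant to support.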
Different techniques of vocabulary control have been adopted in tools such as the List of
Subject Headings (LSH), Thesaurus, Thesaurofacet, etc.
11.4 Vocabulary Tools
There are two main kinds of controlled vocabulary tools used in libraries: subject headings and thesauri.
While the differences between the two are diminishing, some minor ones remain.
Historically, subject headings were designed for cataloguers to describe books in library catalogues,
while thesauri were used by indexers to apply index terms to documents and articles. Subject
headings tend to be broader in scope, describing whole books, while thesauri tend to be more
specialized, covering very specific disciplines. Also, because of the card catalogue system, subject
headings tend to have terms that are in indirect order (though with the rise of automated systems
this practice is disappearing), while thesaurus terms are always in direct order.
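The difference between indirect and direct order can be shown with a small sketch. It assumes the simplest case, a heading inverted around a single comma (for example "Chemistry, Organic"); the function name `to_direct_order` is hypothetical, and real subject heading lists also use commas for other purposes, so this is an illustration rather than a general conversion rule.

```python
def to_direct_order(heading):
    """Turn an inverted heading such as 'Chemistry, Organic' into direct
    order ('Organic Chemistry'). Headings without a comma are returned
    unchanged apart from surrounding whitespace."""
    head, sep, modifier = heading.partition(",")
    if not sep:
        return heading.strip()
    return f"{modifier.strip()} {head.strip()}"


print(to_direct_order("Chemistry, Organic"))
```

In a card catalogue the indirect form files the heading under "Chemistry" alongside other chemistry headings; an automated system can find the word in any position, which is one reason the inverted form is less necessary.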