Page 204 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Unit 11: Indexing Language: Types and Characteristics
Subject headings also tend to use more pre-coordination of terms, such that the designer of the
controlled vocabulary combines various concepts to form one authorized subject heading
(e.g., children and terrorism), while thesauri tend to use singular direct terms. Lastly, thesauri list not
only equivalent terms but also narrower, broader, and related terms among the various authorized
and non-authorized terms, while historically most subject headings did not.
For example, the Library of Congress Subject Headings did not have much syndetic structure
until 1943, and it was not until 1985 that it began to adopt the thesaurus-type terms “Broader term”
and “Narrower term”.
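The relationship types described above (equivalent, broader, narrower, and related terms) can be pictured as a small data structure. The following is only a minimal sketch in Python, not the format of any particular thesaurus: the class names `Term` and `Thesaurus`, the method names, and the sample terms are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Term:
    """One authorized term and its links (hypothetical structure)."""
    label: str
    broader: Set[str] = field(default_factory=set)    # BT links
    narrower: Set[str] = field(default_factory=set)   # NT links
    related: Set[str] = field(default_factory=set)    # RT links
    used_for: Set[str] = field(default_factory=set)   # non-authorized equivalents


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}
        self.use: Dict[str, str] = {}  # non-authorized term -> authorized term

    def term(self, label: str) -> Term:
        # Create the entry on first use so links can be added in any order.
        return self.terms.setdefault(label, Term(label))

    def add_hierarchy(self, broader: str, narrower: str) -> None:
        # Hierarchical links are recorded in both directions.
        self.term(broader).narrower.add(narrower)
        self.term(narrower).broader.add(broader)

    def add_related(self, a: str, b: str) -> None:
        # Related-term links are symmetric.
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_equivalent(self, authorized: str, variant: str) -> None:
        # The variant is a non-authorized entry point to the authorized term.
        self.term(authorized).used_for.add(variant)
        self.use[variant] = authorized

    def resolve(self, label: str) -> str:
        # Map a non-authorized term to its authorized form, if one is known.
        return self.use.get(label, label)


# Illustrative sample terms (assumed, not taken from a real thesaurus).
sample = Thesaurus()
sample.add_hierarchy("Mammals", "Dogs")
sample.add_related("Dogs", "Veterinary medicine")
sample.add_equivalent("Dogs", "Canines")
```

With this sketch, `sample.resolve("Canines")` leads the searcher from the non-authorized term to the authorized term "Dogs", and the broader and narrower links can be followed in either direction.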
The terms are chosen and organized by trained professionals (including librarians and information
scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately
describe what a given document is actually about, even if the terms themselves do not occur within
the document’s text. Well-known subject heading systems include the Library of Congress system,
MeSH, and Sears. Well-known thesauri include the Art and Architecture Thesaurus and the ERIC
Thesaurus.
Choosing the authorized terms to be used is a tricky business. Besides the areas already considered
above, the designer has to consider the specificity of the term chosen, whether to use direct entry,
and the consistency and stability of the language. Lastly, the amount of pre-coordination (in which case
the degree of enumeration versus synthesis becomes an issue) and post-coordination in the system is
another important issue.
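The difference between pre-coordination and post-coordination can be shown with a small example. The sketch below is an illustration only: the document identifiers, the index contents, and the function names `build_index`, `search_pre`, and `search_post` are assumptions made for this example. In the pre-coordinated index the indexer has already combined the concepts into one heading; in the post-coordinated index the concepts are stored separately and combined by the searcher at search time.

```python
from collections import defaultdict


def build_index(assignments):
    """Build a term -> set of document ids index from doc -> terms."""
    index = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            index[term].add(doc_id)
    return index


# Pre-coordinated: the combined heading is assigned at indexing time.
pre_index = build_index({
    "doc1": ["Children and terrorism"],
    "doc2": ["Terrorism"],
})

# Post-coordinated: single concepts are assigned separately.
post_index = build_index({
    "doc1": ["Children", "Terrorism"],
    "doc2": ["Terrorism"],
})


def search_pre(index, heading):
    # The searcher must use the exact authorized heading.
    return set(index.get(heading, set()))


def search_post(index, *terms):
    # The searcher combines single terms with a Boolean AND.
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


print(search_pre(pre_index, "Children and terrorism"))   # {'doc1'}
print(search_post(post_index, "Children", "Terrorism"))  # {'doc1'}
```

Both searches retrieve the same document, but the work of combining the concepts falls on the vocabulary designer in the first case and on the searcher in the second.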
Controlled vocabulary elements (terms/phrases) employed as tags to aid in the content identification
process of documents or other information system entities (e.g. DBMS, Web Services) qualify as
metadata.
11.5 Construction of an IR Thesaurus
Thesaurus Construction
Thesaurus construction is a very specialized activity. Anyone involved in its construction should
have a sound knowledge of the subject and should be logical and have organisational capabilities.
The steps for construction of a thesaurus are as follows.
• Need Analysis: Before a thesaurus is designed, a need analysis should be carried out to
establish whether it is really needed. There may be existing thesauri on similar subjects, and it is
necessary to see whether one of them meets the need. In some cases, an existing thesaurus can be
modified to suit the needs. If it is felt that a thesaurus needs to be constructed, then the following
steps are to be followed.
• Gathering of Terms: The terms to be included are to be collected first. Two approaches can be
followed in this process. In the top-down approach (deductive approach), a committee identifies
the terms and subdivides them from the top down. The problems that may be faced are that it is
difficult to think of all categories or hierarchies of a concept, and that the characteristics
used to divide the genus may not suit the users' needs.
In the empirical (bottom-up) approach, terms are collated from various sources and a category
or hierarchy is formed only if it appears to be useful. The terms are collected using two
principles: the Principle of Literary Warrant and the Principle of User Warrant. In the former, the
logic is that a term justifies its inclusion if it is used in the literature of the subject. The method is
to go through abstracting sources, reference sources, periodical articles, etc. In the latter case,
users/subject specialists may be consulted to gather the terms. However, the combination of
the two yields better results.
• Organisation of Terms: Once the terms are collected, these are to be organised into major
categories and into hierarchies within the categories. Useful inter-hierarchical relationships
should also be delineated.
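The term-gathering step described above combines literary warrant and user warrant. The following is a rough sketch in Python of how the two sources of candidate terms might be compared. It rests on several assumptions: the function names, the `min_docs` threshold, and the sample data are hypothetical, and literary warrant is approximated here by simple substring matching, which is much cruder than the manual scanning of abstracting and reference sources that the text describes.

```python
from collections import Counter


def literary_warrant(documents, candidates, min_docs=2):
    """Return candidate terms found in at least min_docs documents."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for term in candidates:
            # Crude test (assumption): the term occurs somewhere in the text.
            if term.lower() in lowered:
                counts[term] += 1
    return {term for term, n in counts.items() if n >= min_docs}


def gather_terms(documents, candidates, user_suggested, min_docs=2):
    """Group candidate terms by the kind of warrant that supports them."""
    literary = literary_warrant(documents, candidates, min_docs)
    user = set(user_suggested)
    return {
        "both": literary & user,
        "literary_only": literary - user,
        "user_only": user - literary,
    }


# Illustrative sample data (assumed for this example).
documents = [
    "Indexing languages and thesaurus design",
    "A thesaurus for indexing",
    "Cataloguing rules",
]
candidates = ["thesaurus", "indexing", "cataloguing"]
user_suggested = ["thesaurus", "subject heading"]

result = gather_terms(documents, candidates, user_suggested)
```

In this sample, "thesaurus" is supported by both kinds of warrant, "indexing" by literary warrant only, and "subject heading" by user warrant only. Terms in the last two groups would still need the judgement of the compilers before being included or dropped.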
LOVELY PROFESSIONAL UNIVERSITY 199