Page 120 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL
Unit 12: Vocabulary Control
describe what a given document is actually about, even if the terms themselves do not occur within
the document’s text. Well known subject heading systems include the Library of Congress system,
MeSH, and Sears. Well known thesauri include the Art and Architecture Thesaurus and the ERIC
Thesaurus.
Choosing the authorized terms is a tricky business. Besides the areas already considered above,
the designer has to consider the specificity of the term chosen, whether to use direct entry, and
the inter consistency and stability of the language. Lastly, the amount of pre-coordination (in which
case the degree of enumeration versus synthesis becomes an issue) and post-coordination in the
system is another important issue.
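The difference between pre-coordination and post-coordination can be illustrated with a minimal Python sketch. The document identifiers, the headings, and the function names search_pre and search_post are all hypothetical, chosen only for illustration. In the pre-coordinate case the indexer combines concepts into one compound heading at indexing time; in the post-coordinate case single concepts are stored and the searcher combines them at search time.

```python
# Pre-coordinate index: the indexer has already combined concepts
# into compound headings (hypothetical examples).
precoordinated = {
    "doc1": ["Libraries--Automation"],
    "doc2": ["Libraries--History"],
}

# Post-coordinate index: single concepts only; combination is left
# to the searcher (hypothetical examples).
postcoordinated = {
    "doc1": {"Libraries", "Automation"},
    "doc2": {"Libraries", "History"},
}


def search_pre(heading):
    """Match one complete compound heading exactly as the indexer built it."""
    return sorted(doc for doc, headings in precoordinated.items()
                  if heading in headings)


def search_post(*concepts):
    """Combine single concepts at search time (a Boolean AND)."""
    wanted = set(concepts)
    return sorted(doc for doc, terms in postcoordinated.items()
                  if wanted <= terms)


print(search_pre("Libraries--Automation"))
print(search_post("Libraries", "Automation"))
print(search_post("Libraries"))
```

In this sketch the pre-coordinate search succeeds only when the searcher supplies the heading in the form the indexer built, while the post-coordinate search lets any combination of stored concepts be requested.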
Controlled vocabulary elements (terms or phrases) employed as tags to aid in the content identification
of documents or of other information system entities (e.g. DBMS, Web Services) qualify as
metadata.
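As a rough illustration of controlled vocabulary terms used as metadata tags, the following Python sketch maps variant terms to a single authorized heading before tagging a document. The vocabulary entries and the names AUTHORIZED, authorize and tag_document are assumptions made for this example and are not taken from any actual subject heading system.

```python
# Hypothetical entry vocabulary: each variant term points to the one
# authorized heading that should be used as the tag.
AUTHORIZED = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "automobiles": "Automobiles",
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
}


def authorize(term):
    """Return the authorized heading for a term, or None if it is not in the vocabulary."""
    return AUTHORIZED.get(term.strip().lower())


def tag_document(candidate_terms):
    """Build subject metadata: sorted, de-duplicated authorized headings."""
    headings = {authorize(term) for term in candidate_terms}
    headings.discard(None)  # terms outside the vocabulary are not used as tags
    return sorted(headings)


print(tag_document(["Cars", "autos", "heart attack", "road trips"]))
```

Two variants ("Cars", "autos") collapse to one heading, and a term outside the vocabulary is dropped rather than stored as a free-text tag.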
12.2 Library Science
This functional analysis provides a path away from the arguments that used to characterize information
retrieval in the post-World War II period. Any relatively complete functional analysis of information
storage and retrieval should provide a basis for the mapping of research and development activities.
We suggest that the following components have been, and remain, of most interest in Library
Science:
The Use of Human Intermediaries in Query Development
An emphasis on incorporating external knowledge (expert descriptive cataloguing, classification,
and assignment of subject headings) into the representation (catalogue record).
Vocabulary control (also known as authority control) in creating representations, in syndetic
structure, and in query development.
In online catalogues, minimally a two-stage approach (a Boolean operation to partition the
Representations, followed by the alphabetization of the Retrieved Set) and, commonly, a three-stage
approach (the two-stage approach preceded by a search of the Searchable Index only, for feedback).
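The two-stage and three-stage approaches described for online catalogues can be sketched in a few lines of Python. The records, subject terms, and function names (build_index, preview, boolean_and, retrieve) are hypothetical; this is a simplified sketch under those assumptions, not a description of any particular catalogue system.

```python
# Hypothetical catalogue records (the "Representations").
records = [
    {"title": "Thesaurus Construction", "subjects": {"Thesauri", "Vocabulary control"}},
    {"title": "Authority Work", "subjects": {"Authority control", "Vocabulary control"}},
    {"title": "Online Catalogues", "subjects": {"Catalogues", "Vocabulary control"}},
    {"title": "Boolean Searching", "subjects": {"Catalogues"}},
]


def build_index(recs):
    """Searchable index: subject term -> set of record positions."""
    index = {}
    for position, rec in enumerate(recs):
        for term in rec["subjects"]:
            index.setdefault(term, set()).add(position)
    return index


def preview(index, term):
    """Optional first stage: consult the index only and report how many records match."""
    return len(index.get(term, set()))


def boolean_and(index, *terms):
    """Partition the records: keep positions that carry every requested term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def retrieve(recs, index, *terms):
    """Boolean partition followed by alphabetization of the retrieved set."""
    hits = boolean_and(index, *terms)
    return sorted((recs[i]["title"] for i in hits), key=str.lower)


index = build_index(records)
print(preview(index, "Vocabulary control"))       # feedback before retrieving
print(retrieve(records, index, "Vocabulary control"))
print(retrieve(records, index, "Vocabulary control", "Catalogues"))
```

The retrieved set here is ordered alphabetically by title, with no attempt to rank records by relevance.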
The activities generally referred to as information retrieval research have historically tended to
emphasize:
In storage, the use of algorithmic alternatives to human expertise in creating representations and
indexes. Good examples are automatic keyword indexes (e.g. KWIC) and the generation of vector
space representations of documents’ terms.
In retrieval, the use of highly elaborate partitioning (retrieval) and transforming
algorithms leading to the strict ranking of a set of retrieved documents.
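Both emphases can be illustrated with a small Python sketch: a keyword-in-context (KWIC) style index built from titles, and a term-frequency vector representation ranked by cosine similarity. The stop list, the sample texts, and the function names (kwic, vector, cosine, rank) are assumptions made for this example; real systems use more elaborate tokenization and term weighting.

```python
import math
from collections import Counter

# Assumed stop list; a real system would use a much longer one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def kwic(titles):
    """Return (keyword, rotated title) pairs sorted by keyword.

    Each title is rotated so the keyword comes first; "/" marks the
    original end of the title.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOPWORDS:
                continue
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    return sorted(entries)


def vector(text):
    """Term-frequency vector of a text, ignoring stop words."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank(query, docs):
    """Strictly rank the documents that share at least one term with the query."""
    q = vector(query)
    scored = [(cosine(q, vector(doc)), doc) for doc in docs]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc for score, doc in ordered if score > 0]


for keyword, line in kwic(["Control of Vocabulary", "Indexing the Web"]):
    print(keyword, "|", line)

docs = [
    "vocabulary control in libraries",
    "ranking algorithms",
    "controlled vocabulary and vocabulary control",
]
print(rank("vocabulary control", docs))
```

Unlike a Boolean partition followed by alphabetization, the ranking step orders every retrieved document by a computed score, and no human indexer is involved in creating the representations.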
Others might prefer to nominate other techniques as being characteristic of these streams of research
and development, but any realistic mapping on to a general framework of information storage and
retrieval theory is likely to reveal how complementary rather than contradictory these interests are.
There is a difference in emphasis: the former tends to emphasize quality of data, consistency, and
expert human intervention; the latter explores efficient algorithmic approaches to large volumes
of data. Neither alone can provide a complete approach to selection systems in theory or
in practice.
LOVELY PROFESSIONAL UNIVERSITY 115