Page 225 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Information Analysis and Repackaging
Precision/Recall
Recall is the proportion of the relevant items in a collection that a search actually retrieves.
Precision is the proportion of the items in a retrieved set that are relevant. Several factors in the
design of the thesaurus will enhance search precision and recall.
The specificity of the controlled vocabulary will enhance precision. The use of multi-word descriptors
provides a greater degree of precision for the meaning of terms. Directing users and indexers to
preferred terms from a variety of non-preferred terms also increases precision. Scope notes help
clarify the meaning of terms, thus enhancing precision. The multi-leveled hierarchical structure
will enable users and indexers to find terms at the required level of specificity more readily. The extensive use
of associative relationships enhances recall by directing users and indexers to related aspects of
cheese.
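
The two measures defined above can be made concrete with a short calculation. The following is a minimal sketch in Python; the function name and the record identifiers are assumptions made for illustration and do not come from the thesaurus project described here.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: items the search returned
    relevant:  all items in the collection that are relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant items that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the record identifiers are invented.
relevant = {"rec1", "rec2", "rec3", "rec4"}
retrieved = ["rec1", "rec2", "rec9"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.67 recall=0.50
```

In this example two of the three retrieved items are relevant (precision 0.67), while only two of the four relevant items were found (recall 0.50). A more specific vocabulary tends to raise the first figure, and richer links to related terms tend to raise the second.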
Specificity
A high level of specificity in the terms for specific cheeses and flavour characteristics is reflected in
the thesaurus. The specificity required is determined by the anticipated needs of the store's employees,
customers and online users. Some general guidelines are suggested in the scope notes for the
terms descriptive of <cheese fat content>. Total precision is not possible as guidelines for labeling
and the designation of fat content vary from one country to another.
Most of the names for specific cheeses, such as Brie and Cheddar, can almost be considered class
terms, since varieties of each are produced in several countries. Further refinement of these terms
might be required in the future. It is difficult to be very specific with regard to flavour terms. There
is no standard set of terms in widespread use, so terms were chosen primarily on the basis of
being fairly easy to distinguish from one another. The many-layered hierarchical structure of the thesaurus is
an important aid to users and indexers in finding the correct level of specificity for terms desired in
a search or in indexing cheese types.
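
How a layered hierarchy helps a user settle on the right level of specificity can be sketched in a few lines of Python. This is an illustration only: apart from "Brie" and "Cheddar", every term, relationship and function name below is a hypothetical assumption and not a reproduction of the actual thesaurus.

```python
# Hypothetical mini-thesaurus; apart from "Brie" and "Cheddar", every term
# and relationship here is invented for illustration.
BROADER = {  # narrower term -> broader term (BT)
    "Brie": "soft cheeses",
    "Cheddar": "hard cheeses",
    "soft cheeses": "cheeses",
    "hard cheeses": "cheeses",
}
USE = {  # non-preferred entry term (lowercase) -> preferred term
    "cheddar cheese": "Cheddar",
    "brie cheese": "Brie",
}


def preferred(term):
    """Map an entry term to its preferred term (case-insensitive lookup)."""
    return USE.get(term.lower(), term)


def broader_chain(term):
    """List the preferred term followed by each successively broader term."""
    chain = [preferred(term)]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def narrower_terms(term):
    """Collect every term below `term` in the hierarchy, at any depth."""
    found = []
    for child, parent in BROADER.items():
        if parent == term:
            found.append(child)
            found.extend(narrower_terms(child))
    return found


print(broader_chain("Cheddar cheese"))   # ['Cheddar', 'hard cheeses', 'cheeses']
print(sorted(narrower_terms("cheeses")))  # ['Brie', 'Cheddar', 'hard cheeses', 'soft cheeses']
```

Walking up the chain broadens a search, while collecting narrower terms lets an indexer or searcher move down to the most specific term available.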
Conclusion
The depth of coverage of the thesaurus is definitely limited at present. The current thesaurus
represents the first stage in the development of what will surely become a more expansive work.
The current version reflects the rather limited range of inventory presently carried at the store.
Nineteen specific types of cheese from four European countries are contained in the current inventory.
The store plans to gradually add further varieties from these four countries, and expand to include
some varieties from other European countries, such as Denmark and the Netherlands. The thesaurus,
it is hoped, will expand to include terms involved in the making of cheese, as manufacturing processes
are very significant in determining some of the desired characteristics of cheese.
Liddy's Model (2003) of Natural Language Processing
Hjørland has suggested in several writings that approaches to Library and Information Science
(LIS) are basically epistemological approaches, which is why they may be classified according to
epistemological positions (e.g., as empiricist, rationalist, historicist and pragmatist approaches). For
the application of these categories to indexing in general, see indexing theory. Is this classification
also possible and valid for automatic indexing?
In principle, this should be the case. However, as Liddy points out, it is the “lower levels” of
language that have been thoroughly researched and implemented in natural language processing. Such lower
levels (sounds, words, sentences) are more related to automatic indexing, while higher levels
(meaning, semantics, pragmatics, discourses) are more related to human understanding and
indexing.
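
The contrast drawn above can be illustrated with a very small word-level indexer. The sketch below, in Python, is an assumption-laden example and not Liddy's model: the stop list, the function name and the sample text are all hypothetical. It selects index terms purely by counting word forms, with no access to meaning, which is the kind of processing the lower levels allow.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def index_terms(text, top_n=3):
    """Pick candidate index terms by raw word frequency alone."""
    words = re.findall(r"[a-z]+", text.lower())  # word level: split into tokens
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical sample text.
sample = "Brie is a soft cheese. Cheddar is a hard cheese. The cheese is aged."
print(index_terms(sample))
```

The most frequent remaining word, "cheese", is ranked first, but the procedure cannot tell what the text says about cheese. Judgments of that kind belong to the higher levels (semantics, pragmatics, discourse).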
220 LOVELY PROFESSIONAL UNIVERSITY