Page 214 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Unit 11: Indexing Language: Types and Characteristics




            Thesaurus maintenance
            Someone has to be responsible for this. New terms can be suggested, and temporarily “forced” into
            the thesaurus by cataloguers as they catalogue objects, but someone has to review these terms
            regularly and either accept them and build them into the thesaurus structure, or else decide that
            they are not appropriate for use as indexing terms. In that case they should generally be retained as
            non-preferred terms with USE references to the preferred terms, so that people who seek them will
            not be frustrated.
            An encouraging thought is that once the initial work of setting up the thesaurus has been done, the
            number of new terms to be assessed each week should decrease, and many systems have operated
            successfully in the past with printed thesauri, which are quite difficult to keep up to date.
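
            The review cycle described above can be sketched in a few lines of Python. This is only an
            illustrative model, not a description of any particular cataloguing system: the class name
            Thesaurus, its method names and the sample terms are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Thesaurus:
    # Hypothetical structure, for illustration only.
    # preferred term -> its broader term (None for a top term)
    broader: dict = field(default_factory=dict)
    # non-preferred term -> preferred term (the USE reference)
    use: dict = field(default_factory=dict)
    # terms "forced" in by cataloguers and still awaiting review
    candidates: set = field(default_factory=set)

    def force(self, term):
        """A cataloguer adds a provisional term while cataloguing."""
        if term not in self.broader and term not in self.use:
            self.candidates.add(term)

    def accept(self, term, broader_term=None):
        """The reviewer accepts a candidate and places it in the hierarchy."""
        self.candidates.remove(term)
        self.broader[term] = broader_term

    def reject(self, term, preferred):
        """The reviewer rejects a candidate but retains it as a USE reference."""
        if preferred not in self.broader:
            raise KeyError(f"{preferred!r} is not a preferred term")
        self.candidates.remove(term)
        self.use[term] = preferred

    def resolve(self, term):
        """Return the term under which to index or search."""
        if term in self.use:
            return self.use[term]
        if term in self.broader or term in self.candidates:
            return term
        raise KeyError(term)


t = Thesaurus(broader={"furniture": None, "chairs": "furniture"})
t.force("armchairs")
t.force("seats")
t.accept("armchairs", broader_term="chairs")   # built into the structure
t.reject("seats", preferred="chairs")          # kept as: seats USE chairs
```

            Because the rejected term is kept in the USE table rather than deleted, a later call such as
            t.resolve("seats") still leads the searcher to the preferred term.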
            What sort of fields is a thesaurus appropriate for?
            A thesaurus is not a panacea which will meet all subject retrieval needs. It is particularly appropriate
            for fields which have a hierarchical structure, such as names of objects, subjects, places, materials
            and disciplines, and it might also be used for styles and periods. A thesaurus proper would not
            normally be used for names of people and organisations, but a similar tool, called an authority file,
            is usually used for these. The difference is that while an authority file has preferred and non-preferred
            relationships, it does not have hierarchies.




                    Authority files and thesauri are two examples of a generalised data structure which
                    can allow the indication of any type of relationship between two entries, and modern
                    computer software should allow different types of relationship to be included if needed.
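
            One way to picture this generalised structure is as a store of typed links between entries,
            where each tool simply permits a different set of relationship types. The sketch below is an
            assumption-laden illustration: the class RelationStore is hypothetical, and the codes UF, BT,
            NT and RT are the conventional thesaurus abbreviations (used for, broader term, narrower
            term, related term), which this text does not itself define.

```python
from collections import defaultdict

# Each relationship type and its reciprocal (conventional thesaurus codes).
INVERSE = {"USE": "UF", "UF": "USE", "BT": "NT", "NT": "BT", "RT": "RT"}


class RelationStore:
    """Hypothetical store of typed, reciprocal links between entries."""

    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.links = defaultdict(set)   # (entry, relation) -> linked entries

    def add(self, source, relation, target):
        if relation not in self.allowed:
            raise ValueError(f"relationship {relation!r} not permitted here")
        # Record the link and its reciprocal in one step.
        self.links[(source, relation)].add(target)
        self.links[(target, INVERSE[relation])].add(source)

    def related(self, entry, relation):
        return sorted(self.links.get((entry, relation), ()))


# An authority file: preferred / non-preferred links only, no hierarchy.
authority = RelationStore({"USE", "UF"})
authority.add("Clemens, Samuel", "USE", "Twain, Mark")

# A thesaurus: the same structure, with hierarchical and related links allowed.
thesaurus = RelationStore(INVERSE)
thesaurus.add("chairs", "BT", "furniture")
thesaurus.add("chairs", "RT", "upholstery")
```

            Under this model the only difference between the two tools is the set passed as allowed, so
            further relationship types could be added without changing the structure itself.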
            Other subject retrieval techniques
            A thesaurus is an essential component for reliable information retrieval, but it can usefully be
            complemented by two other types of subject retrieval mechanism.
            Classification schemes
            While a thesaurus inherently contains a classification of terms in its hierarchical relationships, it is
            intended for specific retrieval, and it is often useful to have another way of grouping objects. This
            may relate to administrative distribution of responsibility for “collections” within a museum, or to
            subdivisions of these collections into groups which depend on local emphasis. It is also often
            necessary to be able to print a list of objects arranged by subject in a way which differs from the
            alphabetical order of thesaurus terms. Each subject group may be expressed as a compound phrase,
            and given a classification number or code to make sorting possible.
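
            As a small illustration of how a code makes such a listing possible, the following Python
            sketch sorts objects by classification code rather than by thesaurus term. The codes, the
            compound phrases and the object names are invented for the example and do not come from any
            real scheme.

```python
from itertools import groupby

# Hypothetical scheme: code -> compound phrase naming the subject group.
SCHEME = {
    "A10": "Domestic furniture and fittings",
    "A20": "Lighting and heating equipment",
    "B05": "Agricultural hand tools",
}

# (object name, classification code); sample data only.
objects = [
    ("oil lamp", "A20"),
    ("hay fork", "B05"),
    ("armchair", "A10"),
    ("candlestick", "A20"),
]


def classified_list(objects, scheme):
    """Return listing lines grouped by code, in code order."""
    ordered = sorted(objects, key=lambda o: (o[1], o[0]))
    lines = []
    for code, group in groupby(ordered, key=lambda o: o[1]):
        lines.append(f"{code}  {scheme[code]}")
        lines.extend(f"    {name}" for name, _ in group)
    return lines


print("\n".join(classified_list(objects, SCHEME)))
```

            Sorting on the code places related groups next to one another in the printed list, which an
            alphabetical sort on the phrases themselves would not do.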
            Free text
            It is highly desirable to be able to search for specific words or phrases which occur in object
            descriptions. These may identify individual items by unique words such as trade names which do
            not occur often enough to justify inclusion in the thesaurus.
            A computer system may “invert” some or all fields of the record, that is, make all the words in them
            available for searching through a free-text index, or it may be possible to scan records by reading
            them sequentially while looking for particular words. The latter process is fairly slow, but is a
            useful way of refining a search once an initial group has been selected by using thesaurus terms.
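
            Both free-text approaches can be shown in miniature. The Python sketch below builds an
            inverted index over one field and then uses a sequential scan to refine a group already
            selected by thesaurus term. The record layout, the function names and the sample data
            (including the trade name "Lumex") are invented for illustration.

```python
import re
from collections import defaultdict

# Sample records: "term" holds the thesaurus term, "description" is free text.
records = {
    1: {"term": "lamps", "description": "Brass oil lamp marked Lumex"},
    2: {"term": "lamps", "description": "Miner's safety lamp, steel gauze"},
    3: {"term": "chairs", "description": "Oak armchair with brass studs"},
}


def words(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_inverted_index(records, fields):
    """Map every word in the chosen fields to the records containing it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for name in fields:
            for word in words(rec[name]):
                index[word].add(rec_id)
    return index


def scan(records, ids, phrase):
    """Read the given records one by one, keeping those containing the phrase."""
    phrase = phrase.lower()
    return [i for i in sorted(ids)
            if phrase in records[i]["description"].lower()]


# Inverted index: one lookup finds every record containing a word.
index = build_inverted_index(records, ["description"])
brass = sorted(index["brass"])

# Sequential scan: slower, so applied only to a group chosen by thesaurus term.
lamps = {i for i, r in records.items() if r["term"] == "lamps"}
hits = scan(records, lamps, "oil lamp")
```

            The scan reads every record it is given, so its cost grows with the size of the group. This
            is why it suits the refinement of a small initial selection rather than a search of the whole file.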









                                             LOVELY PROFESSIONAL UNIVERSITY                                   209