Page 204 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Unit 11: Indexing Language: Types and Characteristics




            Subject headings also tend to use more pre-coordination of terms: the designer of the
            controlled vocabulary combines several concepts into one authorized subject heading
            (e.g., children and terrorism), while thesauri tend to use single direct terms. Lastly, thesauri
            list not only equivalent terms but also narrower, broader and related terms among the
            authorized and non-authorized terms, while historically most subject heading lists did not.
            For example, the Library of Congress Subject Headings had little syndetic (cross-reference)
            structure until 1943, and it was not until 1985 that it began to adopt the thesaurus-type labels
            “Broader term” and “Narrower term”.
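
            The relationships a thesaurus records (equivalent, broader, narrower and related terms) can be
            pictured as a small data structure. The following is a minimal sketch, not the format of any
            particular thesaurus: the class names, the method names and the sample terms are all
            assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """An authorized term and its cross-references (syndetic structure)."""
    term: str
    used_for: set = field(default_factory=set)  # UF: non-authorized equivalents
    broader: set = field(default_factory=set)   # BT: broader terms
    narrower: set = field(default_factory=set)  # NT: narrower terms
    related: set = field(default_factory=set)   # RT: related terms


class Thesaurus:
    def __init__(self):
        self.entries = {}   # authorized term -> ThesaurusEntry
        self.use_refs = {}  # non-authorized term -> authorized term

    def entry(self, term):
        # Create the entry on first use so relationships can be added in any order.
        if term not in self.entries:
            self.entries[term] = ThesaurusEntry(term)
        return self.entries[term]

    def add_equivalent(self, authorized, variant):
        self.entry(authorized).used_for.add(variant)
        self.use_refs[variant] = authorized

    def add_hierarchy(self, broader, narrower):
        # BT and NT are reciprocal, so both sides are recorded together.
        self.entry(broader).narrower.add(narrower)
        self.entry(narrower).broader.add(broader)

    def add_related(self, a, b):
        # RT is symmetric.
        self.entry(a).related.add(b)
        self.entry(b).related.add(a)

    def resolve(self, term):
        """Map a non-authorized term to its authorized form, else return it unchanged."""
        return self.use_refs.get(term, term)


# Illustrative terms only, not taken from any published thesaurus.
thesaurus = Thesaurus()
thesaurus.add_hierarchy("Mammals", "Cats")
thesaurus.add_equivalent("Cats", "Felines")
thesaurus.add_related("Cats", "Pet care")

print(thesaurus.resolve("Felines"))
print(sorted(thesaurus.entry("Cats").broader))
```

            Recording both directions of each hierarchical and related-term link in one call keeps the
            cross-references reciprocal, which is the property that distinguishes a syndetic structure
            from a flat list of headings.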
            The terms are chosen and organized by trained professionals (including librarians and information
            scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately
            describe what a given document is actually about, even if the terms themselves do not occur within
            the document’s text. Well-known subject heading systems include the Library of Congress
            Subject Headings, MeSH, and Sears. Well-known thesauri include the Art and Architecture
            Thesaurus and the ERIC Thesaurus.
            Choosing the authorized terms is a tricky business. Besides the areas already considered
            above, the designer has to consider the specificity of the terms chosen, whether to use direct
            entry, and the consistency and stability of the language. Lastly, the balance between
            pre-coordination (in which case the degree of enumeration versus synthesis becomes an issue)
            and post-coordination in the system is another important issue.
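
            The difference between pre-coordination and post-coordination can be made concrete with a
            toy index. This is only a sketch under stated assumptions: the document identifiers, the
            function names and the term assignments are invented for illustration. In the
            pre-coordinate case the indexer assigns one combined heading; in the post-coordinate case
            the indexer assigns single terms and the searcher combines them at search time.

```python
from collections import defaultdict


def build_index(assignments):
    """Invert {document: [terms]} into {term: set of documents}."""
    index = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def post_coordinate_search(index, *terms):
    """Combine single terms at search time (Boolean AND over their postings)."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Pre-coordinate indexing: the concepts are combined by the indexer.
pre_index = build_index({
    "doc1": ["Children and terrorism"],
    "doc2": ["Terrorism"],
})

# Post-coordinate indexing: each concept is assigned separately.
post_index = build_index({
    "doc1": ["Children", "Terrorism"],
    "doc2": ["Terrorism"],
    "doc3": ["Children"],
})

pre_hits = pre_index["Children and terrorism"]
post_hits = post_coordinate_search(post_index, "Children", "Terrorism")
print(pre_hits, post_hits)
```

            Both routes retrieve the same document here. The pre-coordinate route needs the combined
            heading to have been foreseen by the vocabulary designer, while the post-coordinate route
            lets the searcher form any combination of the single terms.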
            Controlled vocabulary elements (terms or phrases) employed as tags to aid in identifying the
            content of documents or other information system entities (e.g., DBMS, Web services) qualify
            as metadata.

            11.5 Construction of an IR Thesaurus


            Thesaurus Construction
            Thesaurus construction is a very specialized activity. Anyone involved in it should have a sound
            knowledge of the subject, think logically and have good organisational skills.
            The steps for constructing a thesaurus are as follows.
              •  Need Analysis: Before a thesaurus is designed, a need analysis should be done to establish
                 whether it is really needed. There may be existing thesauri on similar subjects, so it is
                 necessary to see whether one of them meets the need. In some cases, an existing thesaurus
                 can be modified to suit the need. If it is felt that a thesaurus needs to be constructed, then
                 the following steps are to be followed.
              •  Gathering of Terms: The terms to be included are collected first. Two approaches can be
                 followed in this process. In the top-down (deductive) approach, a committee identifies
                 the terms and subdivides them from the top down. The problems that may be faced are
                 that it is difficult to think of all the categories or hierarchies of a concept, and the
                 characteristics used to divide the genus may not suit the users’ needs.
                 In the bottom-up (empirical) approach, terms are collated from various sources and a
                 category or hierarchy is formed only if it appears to be useful. The terms are collected using
                 two principles: the Principle of Literary Warrant and the Principle of User Warrant. Under
                 the former, a term justifies its inclusion if it is used in the literature of the subject; the
                 method is to go through abstracting sources, reference sources, periodical articles, etc.
                 Under the latter, users and subject specialists are consulted to gather the terms. A
                 combination of the two, however, yields better results.
              •  Organisation of Terms: Once the terms are collected, they are organised into major
                 categories and into hierarchies within those categories. Useful inter-hierarchical
                 relationships should also be delineated.
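
            The bottom-up gathering and organisation steps above can be sketched in a few lines. This is
            an illustration under stated assumptions, not a prescribed procedure: the function names,
            the sample terms, the category labels and the threshold of two sources for literary warrant
            are all hypothetical choices.

```python
from collections import Counter


def literary_warrant(sources, min_sources=2):
    """Keep terms that occur in at least `min_sources` literature sources.

    The threshold is an assumed cut-off; the text only requires that a term
    be used in the literature of the subject.
    """
    counts = Counter()
    for terms in sources:
        # Count each term once per source, ignoring case.
        counts.update({term.lower() for term in terms})
    return {term for term, n in counts.items() if n >= min_sources}


def gather_terms(sources, user_suggestions, min_sources=2):
    """Combine literary warrant with user warrant, as the text recommends."""
    user_terms = {term.lower() for term in user_suggestions}
    return literary_warrant(sources, min_sources) | user_terms


def organise(terms, category_of):
    """Group gathered terms into major categories supplied by the compiler."""
    categories = {}
    for term in sorted(terms):
        category = category_of.get(term, "unclassified")
        categories.setdefault(category, []).append(term)
    return categories


# Hypothetical candidate terms drawn from three literature sources.
sources = [
    ["Indexing", "Thesaurus", "Abstracting"],
    ["Thesaurus", "Indexing"],
    ["Classification"],
]
user_suggestions = ["Classification", "Subject headings"]

terms = gather_terms(sources, user_suggestions)
categories = organise(terms, {
    "indexing": "processes",
    "classification": "processes",
    "thesaurus": "tools",
    "subject headings": "tools",
})
print(categories)
```

            In this sample, "abstracting" appears in only one source and is suggested by no user, so it is
            left out, while "classification" is admitted on user warrant alone. Hierarchies within each
            category would still have to be built by the compilers.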





                                             LOVELY PROFESSIONAL UNIVERSITY                                   199