Page 219 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                   Notes            •  Forms of: Bibliometric Knowledge Organization
                                    •  Approaches based on Exemplary documents
                                 Conclusion
                                 Automatic indexing may, at first, look like a reasonably limited and well-defined research topic.
                                 Important developments have taken place, the practical implications of which most of us use almost
                                 every day. However, there seem to be no limits to how automatic indexing may be improved or to
                                 how far the theoretical outlook opens up. Nearly every aspect of human language may be involved in
                                 improving the machine processing of language (and each natural language may need special
                                 consideration).
                                 Language is in turn connected to human action and to cultural and social issues, and a given natural
                                 language is not just one well-defined thing, which is why forms of sublanguages also have to be
                                 considered. Research in automatic indexing is no longer primarily a question of better computers, but
                                 a question of better understanding of human language and the social actions that this language
                                 serves.
                                 Assigned indexing that is not just a simple substitution of document terms with synonyms,
                                 but that represents independent conceptualizations of document contents, may turn out to be the
                                 most important area in which human indexing performs better than automatic indexing (for example,
                                 assigning “romantic poem” to a poem that does not describe itself as such).

                                 Indexing Language
                                 Suitability for our Apple Environment
                                 Our thesaurus will meet the needs of our indexers by providing them with a searchable database of
                                 apple characteristics, which they will use to describe apple varieties and to identify
                                 various uses for apples. The resulting list of apple varieties will then be used by the general public to identify
                                 the right type of apple for their needs and desires. Keeping these end-users in mind, we have aimed
                                 to consistently select the most common and simple word or phrase as our preferred term when
                                 several options were available.
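
                                 The preferred-term choice described above can be pictured as a lookup from entry terms to
                                 preferred terms. The following is only a minimal sketch: the synonym pairs, the USE table and
                                 the function name preferred_term are hypothetical illustrations, not terms taken from the
                                 actual thesaurus.

```python
# Hypothetical entry vocabulary: non-preferred term -> preferred term.
# These pairs are invented for illustration; the real thesaurus may differ.
USE = {
    "acidic": "tart",
    "sharp": "tart",
    "pie making": "baking",
}


def preferred_term(term: str) -> str:
    """Return the preferred term for an entry term.

    A term with no USE reference is assumed to be preferred already
    and is returned in normalized (lower-case, trimmed) form.
    """
    key = " ".join(term.lower().split())
    return USE.get(key, key)


if __name__ == "__main__":
    for entry in ("Acidic", "pie  making", "tart"):
        print(entry, "->", preferred_term(entry))
```

                                 Routing every synonym to one common, simple word keeps the records consistent while still
                                 letting a member of the public start from whichever word comes to mind.
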
                                 Creating records for this database will require the services of people thoroughly familiar with
                                 different varieties of apples. This is necessary so that the indexers can describe the
                                 apples in a way that meaningfully distinguishes between varieties. Each apple variety
                                 will be tasted and described by a panel of apple judges to attempt to achieve consensus on the
                                 properties of the apples being described.

                                 Type of Indexing Language
                                 Our thesaurus is a controlled language thesaurus and is based on the ANSI/NISO Z39.19-1993
                                 standard, Guidelines for the Construction, Format, and Management of Monolingual Thesauri.
                                 Indexers will be able to select from a list of terms when they are describing the apple varieties. They
                                 will also have access to a list of pre-approved modifiers such as “very” and “light.”
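
                                 A controlled list with optional modifiers could be enforced along the following lines. This
                                 is a sketch under stated assumptions: the TERMS set is a hypothetical sample, only the two
                                 modifiers quoted in the text are included, and it is assumed that at most one modifier
                                 precedes a term.

```python
# Hypothetical sample of approved terms; the real term list is not given here.
TERMS = {"sweet", "tart", "crisp", "red", "yellow", "thick skin"}

# The two pre-approved modifiers quoted in the text.
MODIFIERS = {"very", "light"}


def is_valid_descriptor(phrase: str) -> bool:
    """Accept an approved term, optionally preceded by one approved modifier."""
    text = " ".join(phrase.lower().split())
    if text in TERMS:
        return True
    # Split off the first word and test it as a modifier of the remainder.
    first, _, rest = text.partition(" ")
    return first in MODIFIERS and rest in TERMS


if __name__ == "__main__":
    for phrase in ("very sweet", "light red", "extremely sweet", "thick skin"):
        print(phrase, "->", is_valid_descriptor(phrase))
```

                                 Checking the whole phrase first means that multi-word terms are accepted as they stand
                                 before any leading word is treated as a modifier.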

                                 Pre-coordinate Headings and Post-coordinate Retrieval
                                 Our thesaurus will be made searchable through an online database. Therefore, it uses pre-coordinate
                                 headings designed for post-coordinate retrieval. The advantage of this is that it limits the number
                                 of terms in our thesaurus. For example, when including terms for various apple colours, we used
                                 only basic colours such as red and yellow, even though some apples could be described as having a
                                 reddish-yellow colour.
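
                                 The colour example can be sketched as a search that combines basic terms at query time
                                 rather than storing a compound term. The variety names, the RECORDS table and the search
                                 function below are hypothetical; the sketch only assumes that each record carries a set of
                                 basic terms and that a query requires all of its terms to be present.

```python
# Hypothetical records: each variety is indexed with basic terms only.
# A reddish-yellow apple is given both "red" and "yellow" instead of a
# separate compound colour term.
RECORDS = {
    "Variety A": {"red", "yellow", "sweet"},
    "Variety B": {"red", "tart"},
    "Variety C": {"yellow", "sweet"},
}


def search(records: dict[str, set[str]], *terms: str) -> list[str]:
    """Return the names of records indexed with every one of the query terms."""
    wanted = set(terms)
    return sorted(name for name, assigned in records.items() if wanted <= assigned)


if __name__ == "__main__":
    print(search(RECORDS, "red", "yellow"))  # both colours required
    print(search(RECORDS, "sweet"))          # single-term query
```

                                 Because the combination happens in the query, two basic colour terms cover every mixed
                                 colour without adding entries to the thesaurus.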


                                 Forms of Terms
                                 Our thesaurus includes both single-word and multi-word terms whose grammatical forms follow
                                 the standards outlined in sections 3.4 and 3.5. We used pluralization as needed. As most words in



            214                              LOVELY PROFESSIONAL UNIVERSITY