Page 92 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL

Unit 9: Trends in Indexing




          Assigning terms that are not simple substitutions of synonyms, but that represent independent
          conceptualizations of document contents, may turn out to be the most important area in which
          human indexing is better than automatic indexing.
          As traditional classification is a time-consuming and expensive process, it is obvious that
          investigations into the use of automated solutions are worthwhile. At the same time, classification
          is an activity where a significant level of human expertise, abstract thinking and understanding is
          needed, and this is not easy to replace with artificial intelligence or expert systems. There are no known
          examples of traditional library classification being undertaken completely by computer software.
          Knowledge structuring on the Internet has to cope with far larger numbers of resources, exponential
          growth rates and a high risk of changes occurring in documents which already exist.
          This is the background to a growing number of research projects and experimental systems which
          are trying to support knowledge-structuring activities on the Internet with automatic methods.
          Most of these projects use methods of derived indexing, i.e. they extract information from the
          documents and then use it for structuring tasks.
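          As a minimal sketch of what derived indexing for a structuring task might look like, the
          following Python fragment extracts the most frequent words from each document and groups
          documents under the terms they share. The stop-word list, the function names and the
          frequency-based selection are illustrative assumptions, not the method of any project
          named in this unit.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "of", "and", "a", "in", "for", "to", "is", "on"}


def derive_index_terms(text, top_n=3):
    """Return the top_n most frequent non-stop words taken from the text itself."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def group_by_term(documents):
    """Build a simple browsing structure: derived term -> sorted document ids."""
    structure = {}
    for doc_id, text in documents.items():
        for term in derive_index_terms(text):
            structure.setdefault(term, []).append(doc_id)
    return {term: sorted(ids) for term, ids in structure.items()}


# Made-up sample documents, used only to exercise the functions.
documents = {
    "d1": "Bridge design and bridge loads",
    "d2": "Bridge maintenance for steel bridge decks",
}
structure = group_by_term(documents)
```

          Every term in the resulting structure comes directly from the documents; nothing is
          translated or conceptualized, which is the defining limitation of derived methods.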
          Automated classification will probably not replace intellectual classification as far as quality subject
          services are concerned, but will rather support and complement selection and subject indexing
          efforts. Intellectual classification is always needed to validate and improve the automatic methods.
          However, robot-generated databases, as an add-on to quality services in a subject area, will be
          automatically classified. One practical goal in DESIRE II is to explore simple applications of
          automated classification methods on a robot-generated subject index to the Web.




                   Many different tests will be carried out on the ‘All’ Engineering (AE) robot-generated
                   database of engineering documents from the Internet.
          The effort required will be studied and the resulting outcomes evaluated. A pilot service of the ‘All’
          Engineering Web index will offer a full classification and browsing structure with the most suitable
          solution found during the project. In addition, a comprehensive state-of-the-art report on projects,
          methods, alternatives and problems concerning automatic classification will also be presented.

          9.2 Assigned Indexing

          Assigned terms may, on the one hand, simply substitute terms that appear in the document with
          other terms, e.g. from a controlled vocabulary. On the other hand, an assigned term may represent a
          conceptualization of the document that the document itself does not express in any of its terms. A romantic
          poem, for example, does not describe itself as such, but may be assigned the term “romantic poem”.
          It is common to classify documents according to an organization of disciplines.
          Documents may or may not describe their disciplinary memberships. Even if they do, the authors'
          organization of disciplines may be different from the one chosen by a library or an
          information system. Assigning terms that are not simple substitutions of synonyms, but that
          represent independent conceptualizations of document contents, may turn out to be the most
          important area in which human indexing is better than automatic indexing.
          From the preceding discussion, it is clear that if the terms are selected from the title or the text of a
          document and used without any alteration as index terms, then this is referred to as natural language
          indexing or derived indexing. If, however, the selected terms are translated or encoded into authorized
          terms with the help of a prescribed list, then the indexing language becomes controlled or artificial.
          This process is called Assigned Indexing.
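          The distinction can be illustrated with a short Python sketch. The prescribed list below is
          a hypothetical example, not an extract from any real thesaurus, and the function names are
          chosen only for illustration. Derived indexing returns the selected terms unchanged, while
          assigned indexing translates them into authorized terms and sets aside anything the list
          does not cover.

```python
# Hypothetical prescribed list: natural-language term -> authorized term.
CONTROLLED_VOCABULARY = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "motorcars": "Automobiles",
    "heart attack": "Myocardial infarction",
}


def derived_index(selected_terms):
    """Derived (natural language) indexing: use the terms exactly as selected."""
    return list(selected_terms)


def assigned_index(selected_terms, vocabulary=CONTROLLED_VOCABULARY):
    """Assigned indexing: translate each selected term into its authorized form.

    Terms with no entry in the prescribed list are returned separately so that
    a human indexer can review them.
    """
    assigned, unmatched = [], []
    for term in selected_terms:
        authorized = vocabulary.get(term.lower())
        if authorized is None:
            unmatched.append(term)
        elif authorized not in assigned:
            assigned.append(authorized)
    return assigned, unmatched


terms = ["Cars", "autos", "engines"]
natural = derived_index(terms)
authorized, for_review = assigned_index(terms)
```

          In this sketch the synonyms "Cars" and "autos" collapse into the single authorized term
          "Automobiles", while "engines" is left for review. A lookup of this kind covers only the
          substitution case; assigning a term for a concept the document never names, such as
          "romantic poem", still calls for intellectual judgment.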







                                            LOVELY PROFESSIONAL UNIVERSITY                                   87