Page 245 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                                 Computer database indexing in practice

                                 In database indexing practice, computer use ranges along a continuum, from no computer at all
                                 to fully automatic indexing:
                                    •  No computer.
                                    •  Computer clerical support, e.g., for data entry.
                                    •  Computer quality control, e.g., checking that all index terms are valid thesaurus terms.
                                    •  Computer intellectual assistance, e.g., helping with term choice and weighting.
                                    •  Fully automatic indexing.
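
The quality-control step in the list above (checking that all index terms are valid thesaurus terms) can be sketched in a few lines. This is a minimal illustration only: the function name `validate_index_terms`, the three-term thesaurus, and the matching rule (ignore case and extra spaces) are assumptions made for the example, not a description of any particular producer's system.

```python
def validate_index_terms(assigned_terms, thesaurus):
    """Split assigned terms into (valid, invalid) against a thesaurus.

    Matching ignores case and repeated whitespace; valid terms are
    returned in the thesaurus's preferred spelling.
    """
    preferred = {term.casefold(): term for term in thesaurus}
    valid, invalid = [], []
    for term in assigned_terms:
        key = " ".join(term.split()).casefold()
        if key in preferred:
            valid.append(preferred[key])
        else:
            invalid.append(term)  # flag for the human indexer to review
    return valid, invalid


# Hypothetical thesaurus and indexer input, for illustration only.
thesaurus = {"Automatic indexing", "Information retrieval", "Thesauri"}
assigned = ["automatic  indexing", "Thesauri", "Auto-indexing"]

valid, invalid = validate_index_terms(assigned, thesaurus)
print(valid)    # ['Automatic indexing', 'Thesauri']
print(invalid)  # ['Auto-indexing']
```

Rejected terms are returned rather than silently dropped, so the indexer, not the program, decides what to do with them.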

                                 Automatic Indexing
                                 Most database producers use computers at a number of different steps along this continuum. At the
                                 moment, however, automatic indexing is only ever used for a part of a database, for example, for a
                                 specific subject, access point, or document type.
                                 Automatic indexing is used by the Defense Technical Information Center (DTIC) for the
                                 management-related literature in its database; it is used by FIZ Karlsruhe for indexing chemical
                                 names; it was used until 1992 by the Russian International Centre for Scientific and Technical
                                 Information (ICSTI) for its Russian language materials; and it was used by INSPEC for the re-indexing
                                 of its backfiles to new standards (Hodge 1994).
                                 BIOSIS (Biological Abstracts) uses computers at all steps on the continuum, and uses automatic
                                 indexing in a number of areas. Title keywords are mapped by computer to the Semantic Vocabulary
                                 of 15,000 words; the terms from the Semantic Vocabulary are then mapped to one of 600 Concept
                                 Headings (that is, subject headings which describe the broad subject area of a document; Lancaster
                                 1991).
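
The two-stage mapping described above (title keyword to vocabulary term, then vocabulary term to concept heading) can be pictured as two successive table lookups. The sketch below is hypothetical throughout: the table contents, the names `KEYWORD_TO_VOCAB`, `VOCAB_TO_HEADING` and `concept_headings`, and the simple word-splitting rule are invented for illustration and do not reproduce the actual BIOSIS vocabularies or software.

```python
import re

# Stage 1 (hypothetical entries): title keyword -> controlled vocabulary term.
KEYWORD_TO_VOCAB = {
    "enzyme": "enzyme",
    "enzymes": "enzyme",
    "photosynthesis": "photosynthesis",
    "photosynthetic": "photosynthesis",
}

# Stage 2 (hypothetical entries): vocabulary term -> broad concept heading.
VOCAB_TO_HEADING = {
    "enzyme": "Biochemistry",
    "photosynthesis": "Plant physiology",
}


def concept_headings(title):
    """Return the concept headings suggested by the words of a title."""
    headings = []
    for word in re.findall(r"[a-z]+", title.lower()):
        vocab_term = KEYWORD_TO_VOCAB.get(word)
        if vocab_term is None:
            continue  # word is not in the controlled vocabulary
        heading = VOCAB_TO_HEADING.get(vocab_term)
        if heading is not None and heading not in headings:
            headings.append(heading)
    return headings


headings = concept_headings("Photosynthetic enzymes in barley leaves")
print(headings)  # ['Plant physiology', 'Biochemistry']
```

Keeping the two tables separate mirrors the description in the text: many keywords collapse to one vocabulary term, and many vocabulary terms collapse to one broad heading.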
                                 The version of BIOSIS Previews available on the database host STN International uses automatic
                                 indexing to allocate Chemical Abstracts Service Registry Numbers to articles to describe the
                                 chemicals, drugs, enzymes and biosequences discussed in the article. The codes are allocated without
                                 human review, but a human operator spends five hours per month maintaining authority files and
                                 rules (Hodge 1994).
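
Allocation of codes from an authority file, as described above, amounts to recognising known substance names in the text and emitting the corresponding Registry Numbers. The following sketch assumes a tiny hand-made authority file and a whole-phrase matching rule; the names `AUTHORITY_FILE`, `allocate_registry_numbers` and `has_valid_check_digit` are hypothetical, and the real rules used on STN are not documented in this text. The check-digit test uses the standard CAS Registry Number rule (each digit before the check digit is weighted by its position counted from the right, and the sum modulo 10 gives the check digit).

```python
import re

# Hypothetical authority file: name variants -> CAS Registry Number.
AUTHORITY_FILE = {
    "caffeine": "58-08-2",
    "ethanol": "64-17-5",
    "ethyl alcohol": "64-17-5",
    "aspirin": "50-78-2",
    "acetylsalicylic acid": "50-78-2",
}


def has_valid_check_digit(registry_number):
    """Verify the final check digit of a CAS Registry Number."""
    digits = registry_number.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    total = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check


def allocate_registry_numbers(text):
    """Return the sorted Registry Numbers for names found in the text."""
    lowered = text.lower()
    found = set()
    for name, number in AUTHORITY_FILE.items():
        # Match whole words or phrases only, so 'ethanol' is not found
        # inside a longer chemical name.
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            found.add(number)
    return sorted(found)


numbers = allocate_registry_numbers(
    "Effects of ethyl alcohol and caffeine on sleep"
)
print(numbers)  # ['58-08-2', '64-17-5']
```

In a scheme like this, the human maintenance effort goes into the authority file and matching rules rather than into reviewing individual records.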

                                 Retrieval and Ranking Tools
                                 There are two sides to the information retrieval process: documents must be indexed (by humans or
                                 computers) to describe their subject content; and documents must be retrieved using retrieval
                                 software and appropriate search statements. Retrieval and ranking tools include those used with
                                 bibliographic databases, the ‘indexes’ used on the Internet, and personal computer software packages
                                 such as Personal Librarian (Koll 1993). Some programs, such as ISYS, are specialised for the fast
                                 retrieval of search words.
                                 In theory, indexing and retrieval software are complementary, and both are needed for optimal
                                 retrieval. In practice, however, especially with documents in full-text databases, indexing is
                                 often omitted, and the retrieval software is relied on instead.




                                             For these documents, which will not be indexed, it is important to ensure the best
                                             possible access. To accomplish this, the authors of the documents must be aware
                                             of the searching methods which will be used to retrieve them. Authors must use
                                             appropriate keywords throughout the text, and ensure that keywords are included
                                             in the title and section headings, as these are often given priority by retrieval and
                                             ranking tools (Sunter 1995).
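
To see why keywords in the title and section headings matter, consider a simple field-weighted scoring scheme of the kind a ranking tool might use. The weights (3 for the title, 2 for headings, 1 for body text), the function names, and the two sample documents are assumptions chosen for illustration; real retrieval packages use their own, usually more elaborate, formulas.

```python
import re

# Assumed weights: a match in the title counts three times as much as
# a match in the body text, and a match in a heading twice as much.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def tokenize(text):
    """Lower-case a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(document, query):
    """Sum weighted counts of query-word occurrences in each field."""
    query_terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(document.get(field, ""))
        matches = sum(1 for token in tokens if token in query_terms)
        total += weight * matches
    return total


def rank(documents, query):
    """Return the documents ordered from highest to lowest score."""
    return sorted(documents, key=lambda doc: score(doc, query), reverse=True)


# Hypothetical documents: A mentions the topic only in its body text,
# B uses the keywords in its title and headings.
docs = [
    {"id": "A", "title": "Annual report", "headings": "Overview",
     "body": "Notes on automatic indexing of abstracts."},
    {"id": "B", "title": "Automatic indexing in practice",
     "headings": "Indexing tools", "body": "A survey of methods."},
]

ranked = rank(docs, "automatic indexing")
print([doc["id"] for doc in ranked])  # ['B', 'A']
```

Document B scores 8.0 against 2.0 for document A, although both are about the same subject, simply because its author placed the search words in the title and a heading.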





            240                              LOVELY PROFESSIONAL UNIVERSITY