Page 245 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Information Analysis and Repackaging
Computer database indexing in practice
In database indexing practice, computer use falls along a continuum, from no computer at all
to fully automatic indexing:
• No computer.
• Computer clerical support, e.g., for data entry.
• Computer quality control, e.g., checking that all index terms are valid thesaurus terms.
• Computer intellectual assistance, e.g., helping with term choice and weighting.
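As an illustration of the quality-control step in the list above, the following minimal sketch checks that each term an indexer has assigned appears in a thesaurus. The thesaurus contents, the sample terms, and the function name `check_index_terms` are hypothetical and are not taken from any particular database producer's system.

```python
def check_index_terms(assigned_terms, thesaurus):
    """Split the assigned terms into (valid, invalid) lists."""
    # Normalise case and surrounding whitespace before comparing.
    valid_lookup = {term.strip().lower() for term in thesaurus}
    valid, invalid = [], []
    for term in assigned_terms:
        if term.strip().lower() in valid_lookup:
            valid.append(term)
        else:
            invalid.append(term)
    return valid, invalid


# Hypothetical thesaurus and indexer input (note the misspelt second term).
THESAURUS = {"Automatic indexing", "Information retrieval", "Thesauri"}
valid, invalid = check_index_terms(
    ["automatic indexing", "Infomation retrieval", "Thesauri"], THESAURUS
)
print("Valid:", valid)
print("Needs review:", invalid)
```

In a sketch like this the computer only flags the doubtful term; deciding what the correct term should be remains with the human indexer.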
Automatic Indexing
Most database producers use computers at a number of different steps along this continuum. At the
moment, however, automatic indexing is only ever used for a part of a database, for example, for a
specific subject, access point, or document type.
Automatic indexing is used by the Defense Technical Information Center (DTIC) for the
management-related literature in its database; it is used by FIZ Karlsruhe for indexing chemical
names; it was used until 1992 by the Russian International Centre for Scientific and Technical
Information (ICSTI) for its Russian language materials; and it was used by INSPEC for the re-indexing
of its backfiles to new standards (Hodge 1994).
BIOSIS (Biological Abstracts) uses computers at all steps on the continuum, and uses automatic
indexing in a number of areas. Title keywords are mapped by computer to the Semantic Vocabulary
of 15,000 words; the terms from the Semantic Vocabulary are then mapped to one of 600 Concept
Headings (that is, subject headings which describe the broad subject area of a document; Lancaster
1991).
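The two-stage mapping described above (title keyword to vocabulary term, then vocabulary term to broad heading) can be sketched with two lookup tables. This is only an illustration of the general idea: the table entries, heading names, and the function `concept_headings` are invented for the example and are not the actual BIOSIS Semantic Vocabulary or Concept Headings.

```python
import re

# Hypothetical fragments of the two mapping tables.
KEYWORD_TO_VOCAB = {
    "enzyme": "enzyme",
    "enzymes": "enzyme",
    "photosynthesis": "photosynthesis",
    "chloroplasts": "chloroplast",
}
VOCAB_TO_HEADING = {
    "enzyme": "Biochemistry",
    "photosynthesis": "Plant Physiology",
    "chloroplast": "Plant Physiology",
}


def concept_headings(title):
    """Return the broad headings suggested by the words of a title."""
    headings = []
    for word in re.findall(r"[a-z]+", title.lower()):
        vocab_term = KEYWORD_TO_VOCAB.get(word)  # stage 1: keyword to vocabulary
        if vocab_term is None:
            continue  # word is not in the controlled vocabulary
        heading = VOCAB_TO_HEADING.get(vocab_term)  # stage 2: vocabulary to heading
        if heading is not None and heading not in headings:
            headings.append(heading)
    return headings


print(concept_headings("Enzymes of photosynthesis in spinach chloroplasts"))
```

Words that are absent from the first table, such as "spinach" here, simply contribute no heading, which is one reason the vocabulary needs continuing maintenance.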
The version of BIOSIS Previews available on the database host STN International uses automatic
indexing to allocate Chemical Abstracts Service Registry Numbers to articles to describe the
chemicals, drugs, enzymes and biosequences discussed in the article. The codes are allocated without
human review, but a human operator spends five hours per month maintaining authority files and
rules (Hodge 1994).
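A rough sketch of how an authority file might drive this kind of allocation is given below. It assumes a simple table of substance names and Registry Numbers and matches names in the article text; the function names and the three-entry table are hypothetical, and the actual rules used on STN are not described in the source. The check-digit test follows the published CAS convention, in which the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10, and is the sort of rule a human operator could use when maintaining the file.

```python
import re


def is_valid_cas(rn):
    """Check the format and check digit of a CAS Registry Number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


# Hypothetical authority file: substance name -> Registry Number.
AUTHORITY_FILE = {
    "caffeine": "58-08-2",
    "ethanol": "64-17-5",
    "water": "7732-18-5",
}


def allocate_registry_numbers(text, authority):
    """Return the Registry Numbers of substances named in the text."""
    lowered = text.lower()
    found = {}
    for name, rn in authority.items():
        mentioned = re.search(r"\b" + re.escape(name) + r"\b", lowered)
        if mentioned and is_valid_cas(rn):
            found[name] = rn
    return found


print(allocate_registry_numbers(
    "Effects of caffeine and ethanol on sleep", AUTHORITY_FILE))
```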
Retrieval and Ranking Tools
There are two sides to the information retrieval process: documents must be indexed (by humans or
computers) to describe their subject content; and documents must be retrieved using retrieval
software and appropriate search statements. Retrieval and ranking tools include those used with
bibliographic databases, the ‘indexes’ used on the Internet, and personal computer software packages
such as Personal Librarian (Koll 1993). Some programs, such as ISYS, are specialised for the fast
retrieval of search words.
In theory, indexing and retrieval software are complementary, and both are needed for optimal
retrieval. In practice, however, especially with documents in full-text databases, indexing is often
omitted, and the retrieval software is relied on instead.
For these documents, which will not be indexed, it is important to ensure the best
possible access. To accomplish this, the authors of the documents must be aware
of the searching methods which will be used to retrieve them. Authors must use
appropriate keywords throughout the text, and ensure that keywords are included
in the title and section headings, as these are often given priority by retrieval and
ranking tools (Sunter 1995).
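The priority given to titles and section headings can be illustrated with a simple field-weighted score. This is a minimal sketch and not the method of any particular retrieval package: the weights (3 for the title, 2 for headings, 1 for body text), the document structure, and the function names are all assumptions made for the example.

```python
import re

# Assumed weights: a match in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(document, query):
    """Sum the weighted counts of query words found in each field."""
    query_terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(document.get(field, ""))
        total += weight * sum(1 for token in tokens if token in query_terms)
    return total


def rank(documents, query):
    """Return the documents ordered from highest to lowest score."""
    return sorted(documents, key=lambda doc: score(doc, query), reverse=True)


# Two hypothetical documents that both mention the search words.
docs = [
    {"id": "A", "title": "Annual report", "headings": "Overview",
     "body": "A short note on automatic indexing."},
    {"id": "B", "title": "Automatic indexing in practice",
     "headings": "Indexing tools", "body": "Computers support indexers."},
]
for doc in rank(docs, "automatic indexing"):
    print(doc["id"], score(doc, "automatic indexing"))
```

Under these assumed weights, document B ranks first because its keywords appear in the title and a heading, while document A mentions them only in the body text.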
240 LOVELY PROFESSIONAL UNIVERSITY