Page 232 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Unit 11: Indexing Language: Types and Characteristics




            Again, an algorithm could generate this. It is much more compact than a full coverage. The problem
            is it expects too much of the user. First, the user often has to do two lookups instead of one. In
            addition, users often don’t know the terms they need to look up. For instance, a reader does not
            know the name of the LowObject, but only of the Mid1Object. The reader then has to find Mid2Object
            to find the name LowObject.
            Good human indexers can produce an index of any reasonable length that minimizes user lookup
            time and maximizes user success rates. The result, for our example, is almost always somewhere
            between the complete coverage and absolute minimal coverage.
            Human indexers can do that because they work with three maps in their heads: the map of the book
            or text being indexed, the map of the subject area, and the map of the knowledge levels and mental
            habits of likely users. A good indexer will know in great detail when to use full coverage and when
            to be selective. Could a computer program and database accomplish the same? None do yet. When
            creating an index of a book that has subject hierarchy issues, a human indexer will rely heavily on
            the concept of “importance”.

            Importance

            The main point of book indexing is to speed up human retrieval of meaningful information. For
            that reason over-indexing, which may lead to multiple fruitless searches, is not a good solution. At
            the same time printing a complete (in terms of coverage) index is usually prohibited by cost
            considerations.
            So professionals give considerable thought, while indexing a work, to deciding which topics
            require entries and which do not. Indexers of books who do not understand the
            subject matter may take a machine-like approach to this task. Their rule might be: if it is a noun,
            index it. If the publisher does not want to give readers an index that is
            a quarter as long as the book itself, even when set in 6-point type, the indexer will be asked to
            shorten the index, which is to say, to guess at which entries are important.
            As usual, professional indexers have provided some rules of thumb for this. The most basic is: the
            more the author writes about a subject, the more important it is. A topic with an entire chapter
            devoted to it is more important than a topic that has a couple of pages devoted to it, which in turn
            is more important than a topic covering a single paragraph or sentence. At the bottom of the priority
            list are topics that are merely mentioned.
            We can imagine, if the page-range problem can be solved, that an MI could use the above general
            rule to measure the importance of a term becoming an index entry. Given the allowed length of the
            printed index, the terms with least importance could be eliminated with great precision.
            But we know that a single sentence, say a key definition, may be more important than covering
            longer lengths of text that add little to the discussion. So, given a goal of helping a human user, the
            indexer’s knowledge base and judgement are going to do far better at sorting terms in order of
            importance than any algorithm based on text length.
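The length-based rule discussed above can be sketched in a few lines. This is only an illustrative sketch, not a description of any real indexing program: the `Candidate` class, the `lines_of_coverage` measure, the `prune_by_length` function, and the sample terms and counts are all hypothetical, invented for the example. It ranks candidate terms by how much text is devoted to them and keeps only as many as the index length allows.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    term: str
    lines_of_coverage: int  # hypothetical measure: lines of text devoted to the term


def prune_by_length(candidates, max_entries):
    """Keep the max_entries terms with the most coverage, returned alphabetically."""
    ranked = sorted(candidates, key=lambda c: c.lines_of_coverage, reverse=True)
    kept = ranked[:max_entries]
    return sorted(c.term for c in kept)


# Made-up sample data, for illustration only.
candidates = [
    Candidate("classification", 400),          # an entire chapter
    Candidate("thesaurus", 60),                # a couple of pages
    Candidate("descriptor", 8),                # a single paragraph
    Candidate("precision, definition of", 1),  # one key sentence
    Candidate("Boolean operators", 1),         # merely mentioned
]

index_terms = prune_by_length(candidates, max_entries=3)
print(index_terms)
```

In this sample the one-sentence key definition is dropped along with the passing mention, because text length alone cannot tell them apart. That is the weakness of the length-based rule noted in the text.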
            In fact human indexers make judgements as to importance as they read the text; they are often able
            to draft an extremely usable index approximating a required length (say 5% of the overall text
            length) on a single pass.

            Helping the User

            Professional indexers have many rules and guidelines for constructing indexes, some of which have
            been identified above. Naturally some indexers are more rule-oriented than others. Whenever there is
            a set of rules to be obeyed in a complex terrain, some of those rules will at times conflict with
            one another.
            The overriding rule, when creating book indexes, is to help the user. This may seem like a very
            vague rule, but all human index writers are also index users. Hopefully they use indexes of books





                                             LOVELY PROFESSIONAL UNIVERSITY                                   227