Page 234 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Unit 11: Indexing Language: Types and Characteristics




            Knowledge itself is indexed (somehow) within the human mind. The indexer reads the word “laser”
            and calls up what the indexer knows about lasers, which helps in the interpretation of the text.
            Unlike back-of-the-book indexes, the human mind’s index is not alphabetical. One has no sense of
            thinking “laser, that is an ‘l’ word, after lap and before lattice.” The best current theory is that language
            access in the human brain is nearly holographic and that this is possible due to the nature of neural
            networks.

            Language Applications and Indexing Techniques

            Since back-of-book indexes solve some real-world language problems that are difficult for machines,
            it should not be a surprise that indexing paradigms can be useful in solving language problems other
            than creating indexes themselves. This section considers general questions about the relationship of
            indexes to language.

            Internet Search Mechanisms as Indexing
            Internet search engines typically produce temporary, index-like search results for World Wide Web
            content. One mechanism for generating and sorting such temporary indexes, reputedly used by
            Google, involves counting the number of links into a particular Web page. This allows an algorithm
            to measure how many Web page creators thought a particular page, usually about a particular
            subject, was important enough to point to. Thus the problem of assigning importance discussed
            above is solved by surveying the aggregate judgments of importance made by the humans who
            construct the data pages of the Web.
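            The link-counting idea can be sketched in a few lines of Python. This is only a minimal
            illustration, assuming the Web is given as a small dictionary of pages and their outgoing
            links; the function name rank_by_inbound_links and the example addresses are hypothetical,
            and a plain inbound-link count is a simplification, not a description of how any real
            search engine ranks pages.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    links: dict mapping a source page to an iterable of pages it links to.
    """
    inbound = Counter()
    for source, targets in links.items():
        for target in set(targets):      # count each source once per target
            if target != source:         # ignore self-links
                inbound[target] += 1
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy Web: three pages and the pages they point to.
web = {
    "a.example/lasers": ["b.example/optics", "c.example/physics"],
    "b.example/optics": ["c.example/physics"],
    "d.example/notes": ["c.example/physics", "b.example/optics"],
}
ranking = rank_by_inbound_links(web)
for page, count in ranking:
    print(page, count)
```

            Pages that nobody links to never appear in the ranking, which mirrors the point made
            above: importance is assigned only by the aggregate choices of page creators.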
            A book and its index could be placed on the Web (or similar system) and then trial users tracked to
            see what subject matter they were seeking and how effective the index was in helping them. Using
            the results of what people actually found or failed to find using the index, it would be possible to
            construct an index of the book based solely on reader usage. Unused entries could be eliminated,
            keeping the printed version of the index compact while still saving future users time. Such an
            index would in effect contain the weighted knowledge base correlations of all the readers in the
            sample.
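            A usage-based index of this kind might be derived as in the following sketch. It assumes
            that reader tracking yields a log of (entry, found) pairs; the function name prune_index,
            the min_uses threshold, and the sample data are all hypothetical and serve only to
            illustrate the idea.

```python
from collections import Counter


def prune_index(index, lookups, min_uses=1):
    """Keep only the index entries that trial readers used successfully.

    index:   dict mapping an entry to its list of page numbers.
    lookups: iterable of (entry, found) pairs from reader tracking.
    Returns (kept entries, successful-use counts, failed-lookup counts).
    """
    lookups = list(lookups)
    uses = Counter(entry for entry, found in lookups if found and entry in index)
    missed = Counter(entry for entry, found in lookups if not found)
    kept = {entry: pages for entry, pages in index.items()
            if uses[entry] >= min_uses}
    return kept, uses, missed


# Hypothetical original index and tracking log.
index = {"lasers": [12, 40], "lattices": [77], "lenses": [15]}
lookups = [("lasers", True), ("lasers", True),
           ("lenses", True), ("holograms", False)]

kept, uses, missed = prune_index(index, lookups)
print(kept)    # entries readers actually used
print(missed)  # terms readers sought but failed to find
```

            The use counts act as the weights contributed by the sample of readers, and the failed
            lookups suggest entries that might be worth adding.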
            While an index for a book could be constructed this way, the economics are currently prohibitive.
            No book publisher is likely to undertake the development costs of such a system.

            Language as an Index
            In the mind/brain of each person who knows a language such as English, the vocabulary of the
            language serves as an index to the known world (but not the only index).
            For each person, the known world lies primarily in the past and is represented as memories.
            Included in those memories are a vocabulary and a grammar that are intertwined with other memories.
            A person who has never seen an elephant in person may still remember that an elephant is a large
            land animal with certain characteristics, because that person read it in a book, saw it on TV, or
            heard it from another person.
            The word “elephant” serves as an index, or pointer, to what the person knows about a certain
            animal. Any given word, such as “elephant,” may be connected to a variety of memories. In some cases
            language fits the entry/subentry model of book indexes very well. “Cat,” for instance, is a general
            category, and if questioned someone might say, “But there are other cats besides the house cat, for
            instance lions and tigers.”
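            The entry/subentry view of a word can be pictured with a small data structure. The sketch
            below is purely illustrative: the dictionary mental_index and the function format_entry are
            hypothetical names, and a flat mapping is far simpler than whatever the mind actually does.

```python
# Hypothetical "mental index": each word points to associated subentries.
mental_index = {
    "cat": ["house cat", "lion", "tiger"],
    "elephant": ["large land animal", "seen on TV"],
}


def format_entry(word, index):
    """Render one word and its subentries in back-of-book style."""
    lines = [word]
    for sub in sorted(index.get(word, [])):
        lines.append("    " + sub)   # subentries are indented under the entry
    return "\n".join(lines)


entry_text = format_entry("cat", mental_index)
print(entry_text)
```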






                                             LOVELY PROFESSIONAL UNIVERSITY                                   229