Page 234 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Unit 11: Indexing Language: Types and Characteristics
Knowledge itself is indexed (somehow) within the human mind. The indexer reads the word “laser”
and calls up what the indexer knows about lasers, which helps in the interpretation of the text.
Unlike back-of-the-book indexes, the human mind’s index is not alphabetical. One has no sense of
thinking “laser, that is an L word, after lap and before lattice.” One current theory holds that language
access in the human brain is nearly holographic and that this is possible due to the nature of neural
networks.
Language Applications and Indexing Techniques
Since back-of-book indexes solve some real-world language problems that are difficult for machines, it
should not be a surprise that indexing paradigms can be useful in solving language problems other
than creating indexes themselves. This section considers general questions about the relationship of
indexes to language.
Internet Search Mechanisms as Indexing
Internet search engines typically produce temporary, index-like search results for World Wide Web
content. One mechanism for generating and sorting such temporary indexes, reputedly used by
Google, involves counting the number of links into a particular Web page. This allows an algorithm
to measure how many Web page creators thought a particular page, usually about a particular
subject, was important enough to point to. Thus the problem of assigning importance, discussed
above, is solved by surveying the aggregate judgments of importance made by the humans who
construct the pages of the Web.
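The link-counting idea can be illustrated with a minimal sketch. It is not a description of how any real search engine works. The page names and the function rank_by_inbound_links are hypothetical, and the sketch assumes the link structure is already available as a simple mapping from each page to the pages it points to.

```python
from collections import Counter

def rank_by_inbound_links(links, candidates):
    """Order candidate pages by how many other pages link to them.

    links: dict mapping a page to the list of pages it links to.
    candidates: pages that matched a search, in any order.
    """
    inbound = Counter()
    for source, targets in links.items():
        # Count each linking page once, and ignore links from a page to itself.
        for target in set(targets):
            if target != source:
                inbound[target] += 1
    # Most-linked pages first; ties are broken alphabetically.
    return sorted(candidates, key=lambda page: (-inbound[page], page))

# Hypothetical toy Web: three pages point to "lasers", two to "optics".
links = {
    "a": ["lasers", "optics"],
    "b": ["lasers"],
    "c": ["lasers", "c"],
    "lasers": ["optics"],
}
ranking = rank_by_inbound_links(links, ["optics", "lasers", "c"])
```

In this toy example each inbound link acts as one page creator's vote that the target page is worth pointing to, which is the aggregate human judgment the text describes.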
A book and its index could be placed on the Web (or a similar system), and trial users could then be
tracked to see what subject matter they were seeking and how effective the index was in helping them.
Using the results of what people actually found or failed to find using the index, it would be possible
to construct an index of the book based solely on reader usage. Unused entries could be eliminated,
keeping the printed version of the index relatively compact, and an index compiled in this way should
save future users time. Such an index would in effect contain the weighted knowledge base
correlations of all the readers in the sample.
While an index for a book could be constructed this way, the economics are currently prohibitive.
No book publisher is likely to undertake the development costs of such a system.
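As a rough sketch of the usage-based index described above, the following assumes that reader lookups have been logged as (term, helped) pairs, where helped records whether the reader found what they wanted. The log format, the min_uses threshold, and the function name build_usage_index are all assumptions made for illustration.

```python
from collections import Counter

def build_usage_index(index, lookups, min_uses=1):
    """Prune an index using logged reader lookups.

    index: dict mapping an index term to its list of page numbers.
    lookups: list of (term, helped) pairs from trial readers.
    Returns the pruned index (most-used terms first) and a count of
    terms readers searched for that the index did not contain.
    """
    uses = Counter(term for term, helped in lookups if helped and term in index)
    misses = Counter(term for term, _ in lookups if term not in index)
    # Keep only entries that helped at least min_uses readers.
    kept = [term for term in index if uses[term] >= min_uses]
    kept.sort(key=lambda term: (-uses[term], term))
    return {term: index[term] for term in kept}, misses

# Hypothetical index and lookup log.
index = {"laser": [12, 40], "lattice": [77], "lap": [3]}
lookups = [
    ("laser", True), ("laser", True), ("lattice", True),
    ("lap", False), ("maser", False),
]
compact_index, missing_terms = build_usage_index(index, lookups)
```

The usage counts play the role of the readers' weighting, and the missing terms suggest entries that readers expected but did not find.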
Language as an Index
In the mind/brain of each person who knows a language such as English, the vocabulary of the
language serves as an index to the known world (though not the only index).
For each individual person the known world lies primarily in the past and is represented as memories.
Included in those memories are a vocabulary and a grammar that are intertwined with other memories.
A person who has never seen an elephant in person may still remember that an “elephant” is a large
land animal with certain characteristics because that person read it in a book, saw it on TV, or heard it
from another person.
The word “elephant” serves as an index, or pointer, to what the person knows about a certain
animal. Any given word, such as “elephant,” may be connected to a variety of memories. In some cases
language fits the entry/subentry model of book indexes very well. For instance, “cat” is a general
category, and if questioned someone might say, “but there are other cats besides the house cat, for
instance lions and tigers.”
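The idea of a word as a pointer, with entries and subentries, can be sketched as a small data structure. This is only an analogy to a book index, not a model of how the brain stores language. The mental_index contents and the functions recall and format_entry are invented for illustration.

```python
# Hypothetical "mental index": each word points to associated memories
# and may carry subentries, as a main heading does in a book index.
mental_index = {
    "elephant": {
        "memories": ["large land animal", "read about in a book", "seen on TV"],
        "subentries": {},
    },
    "cat": {
        "memories": ["general category of animal"],
        "subentries": {
            "house cat": ["the cat most people think of first"],
            "lion": ["a big cat"],
            "tiger": ["a big cat"],
        },
    },
}

def recall(word, index=mental_index):
    """Follow the word as a pointer and return the memories attached to it."""
    entry = index.get(word)
    return list(entry["memories"]) if entry else []

def format_entry(word, index=mental_index):
    """Render a word in entry/subentry form, subentries in alphabetical order."""
    lines = [word]
    lines.extend("    " + sub for sub in sorted(index[word]["subentries"]))
    return "\n".join(lines)

cat_entry = format_entry("cat")
```

Here format_entry("cat") yields the heading “cat” followed by indented subentries for house cat, lion, and tiger, mirroring the entry/subentry layout of a printed index.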
LOVELY PROFESSIONAL UNIVERSITY 229