What makes human knowledge and English (or another human language) as an index different
from machine database-type knowledge is the rich set of associations. Locators point to memories,
which may be of words (e.g., if one first read the word “elephant” in an unillustrated book) or may
come directly from the senses. There is no division into levels of entries; to try to make a hierarchy, one would
have to have thousands of interlocking levels. Several words may have locators pointing to the
same memory: “tiger,” “danger,” and “striped” might point to the same memory, for instance.
Phrases like “my birthday” or “my alma mater” could point to large numbers of memories.
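To make this many-to-many structure concrete, here is a minimal sketch in Python; the class and its method names are our own invention for illustration, not part of any established system. Several terms point to one stored memory, and a single phrase points to several memories.

from collections import defaultdict

class AssociativeIndex:
    """A toy many-to-many index: terms point to memories, memories back to terms."""

    def __init__(self):
        self.memories = {}                          # memory id -> description
        self.term_to_memories = defaultdict(set)    # e.g. "tiger" -> {memory ids}
        self.memory_to_terms = defaultdict(set)     # memory id -> {"tiger", "danger", ...}
        self._next_id = 0

    def add_memory(self, description, terms):
        """Store one memory and associate it with every given index term."""
        memory_id = self._next_id
        self._next_id += 1
        self.memories[memory_id] = description
        for term in terms:
            self.term_to_memories[term].add(memory_id)
            self.memory_to_terms[memory_id].add(term)
        return memory_id

    def recall(self, term):
        """Return every memory any locator of this term points to."""
        return [self.memories[m] for m in self.term_to_memories[term]]

index = AssociativeIndex()
# Several words pointing at one memory, as in the "tiger"/"danger"/"striped" example.
index.add_memory("a striped animal seen at the zoo", ["tiger", "danger", "striped"])
# One phrase pointing at many memories, as with "my birthday".
index.add_memory("cake with seven candles", ["my birthday"])
index.add_memory("a surprise party", ["my birthday"])

print(index.recall("tiger"))        # the single zoo memory
print(index.recall("my birthday"))  # both birthday memories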
Good readers instantly (on human time scales) recognize every word in their vocabulary while
reading. Quite complex pieces of knowledge, whether from a novel or a textbook, can be integrated
into the prior knowledge base as fast as the reading goes. Critical readers may spot internal
contradictions or instances where the text contradicts their prior knowledge. The auto-indexing
capabilities of the human brain are likely the key to these abilities.
Language Acquisition as Indexing
In the brief discussion of language as an index, we see that words are acquired with associations. It
would be interesting to know on what framework the brain hangs words and associations, beyond
general-purpose biological explanations like “neural networks.”
Since the memory-language area of the brain seems to work on an association basis, do we add
anything useful (or insightful) to our picture by saying that the brain is indexing as it acquires
language and other memories or knowledge? If nothing else, it frees us from a pure neural network
model for programming these abilities. If a computer system uses a different method of operation,
but achieves the same result, then we should be able to build machine knowledge bases that can
achieve many or all of the desirable traits that machine intelligences (MIs) and other machines lack at present.
In the individual human knowledge base (brain), indexing terms and locators are in a constant state
of modification. The external sensory world provides constant feedback on the quality (usefulness)
of the index. Fail to index foods properly and the result can be poisoning or starvation. Fail to index
predators properly and the result is death. In our slightly less dangerous modern society people are
often rewarded according to their ability to recall appropriate information associations.
Suppose then that we are convinced that machines need human-like knowledge bases in order to
do tasks requiring real language skills. We set out to build a machine that automatically indexes its
experiences of the world, using a language system like English. The obvious question to ask is:
how does a human baby do it?
Having read various neuro-linguistic theories, and having spent some time actually watching
babies learn, what makes sense in terms of our indexing paradigm (grammar acquisition itself,
thankfully, is not part of this project)? Babies start having sensory experiences,
including hearing language spoken, long before they begin articulating words or showing (by their
reactions) that they understand words. At some point they learn their first word, for instance “mama,”
which serves as an index entry. Other words follow, gradually at first and then quickly, including abstract
words like “no.” Things in the sensory world take on names, and changes are correlated with verbs.
So we can assume humans have functions that are able to correlate sounds (words) with various
events in the world. Thus words are indexes to meaningful knowledge about the world.
We must also assume functions that are able to place words and objects into the schemes we call
maps and pictures of the world.
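As a purely illustrative sketch of such a correlating function, the toy Python example below counts how often a heard word and a sensed event occur together and takes the strongest pairing as the word’s provisional meaning; the co-occurrence counting and the sample scenes are our own assumptions for the sake of the example, not a claim about how the brain works.

from collections import defaultdict

# Toy correlator: counts co-occurrences of heard words and sensed events.
# The scenes are invented; real input would be a stream of sensory episodes
# paired with the speech heard during them.
cooccurrence = defaultdict(lambda: defaultdict(int))

scenes = [
    ({"mama", "look"}, {"mother-appears"}),
    ({"mama", "milk"}, {"mother-appears", "feeding"}),
    ({"no", "stop"}, {"toy-taken-away"}),
]

for heard_words, sensed_events in scenes:
    for word in heard_words:
        for event in sensed_events:
            cooccurrence[word][event] += 1

def likely_meaning(word):
    """Return the event most often correlated with a word, if any."""
    events = cooccurrence.get(word)
    if not events:
        return None
    return max(events, key=events.get)

print(likely_meaning("mama"))  # "mother-appears" (seen in two scenes)
print(likely_meaning("no"))    # "toy-taken-away"

Under this toy scheme “mama” becomes tied to the event it most often accompanies; the word-object model below tries to capture the same idea more richly.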
Model for Word Meaning Acquisition
Suppose we try to construct an indexing system corresponding to the word-meaning acquisition
system of humans.
Let the most basic construct in the model be the “word-object.” This could be connected to memories
of all sorts, raw and analyzed, including other word-objects. Memories would also be held in objects,