Page 115 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Information Analysis and Repackaging



                                 6.1 History of Information Retrieval Model

                                 The idea of using computers to search for relevant pieces of information was popularized in the
                                 article “As We May Think” by Vannevar Bush in 1945. The first automated information retrieval systems
                                 were introduced in the 1950s and 1960s. By 1970 several different techniques had been shown to
                                 perform well on small text corpora such as the Cranfield collection (several thousand documents).
                                 Large-scale retrieval systems, such as the Lockheed Dialog system, came into use early in the 1970s.
                                 In 1992, the US Department of Defense, along with the National Institute of Standards and Technology
                                 (NIST), cosponsored the Text Retrieval Conference (TREC) as part of the TIPSTER text program.
                                 The aim was to support the information retrieval community by supplying the infrastructure
                                 that was needed for evaluation of text retrieval methodologies on a very large text collection. This
                                 catalyzed research on methods that scale to huge corpora. The introduction of web search engines
                                 has boosted the need for very large scale retrieval systems even further.
                                 The use of digital methods for storing and retrieving information has led to the phenomenon of
                                 digital obsolescence, where a digital resource ceases to be readable because the physical media, the
                                 reader required to read the media, the hardware, or the software that runs on it, is no longer available.
                                 The information is initially easier to retrieve than if it were on paper, but is then effectively lost.

                                 6.2 General Model of Information Retrieval

                                 The goal of information retrieval (IR) is to provide users with those documents that will satisfy their
                                 information need. We use the word “document” as a general term that could also include non-textual
                                 information, such as multimedia objects. Figure 1 (ahead) provides a general overview of the
                                 information retrieval process, which has been adapted from Lancaster and Warner (1993). Users have
                                 to formulate their information need in a form that can be understood by the retrieval mechanism.
                                 There are several steps involved in this translation process that we will briefly discuss below. Likewise,
                                 the contents of large document collections need to be described in a form that allows the retrieval
                                 mechanism to identify the potentially relevant documents quickly. In both cases, information may be
                                 lost in the transformation process leading to a computer-usable representation. Hence, the matching
                                 process is inherently imperfect.
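                                 The matching process described above can be illustrated with a minimal sketch. It is not the
                                 model of Lancaster and Warner, only an assumed simplification: documents and the query are
                                 each reduced to a set of index terms, and documents are ranked by how many terms they share
                                 with the query. The names represent, retrieve, STOPWORDS and the sample documents are
                                 hypothetical and chosen only for illustration.

```python
import re

# Assumed, illustrative stopword list; real systems use much longer ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def represent(text):
    """Reduce text to a set of lowercase index terms.

    Word order, word frequency and stopwords are discarded, so this
    representation is deliberately lossy.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def retrieve(query, documents):
    """Return ids of documents sharing at least one term with the query,
    ordered by number of shared terms (ties broken by document id)."""
    query_terms = represent(query)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_terms & represent(text))
        if overlap:
            scored.append((overlap, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical sample collection.
documents = {
    "d1": "Evaluation of text retrieval methods on large collections",
    "d2": "Storing digital media for long-term preservation",
    "d3": "Retrieving documents from large text collections",
}

print(retrieve("text retrieval on large collections", documents))
```

                                 In this sketch, document d3 is ranked below d1 because the term “retrieving” does not match
                                 the query term “retrieval”. Information was lost when both were reduced to plain index terms,
                                 which is one small example of why the matching process is imperfect.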
                                 Information seeking is a form of problem solving (Marcus 1994, Marchionini 1992). It proceeds
                                 according to the interaction among eight subprocesses: problem recognition and acceptance, problem
                                 definition, search system selection, query formulation, query execution, examination of results
                                 (including relevance feedback), information extraction, and reflection/iteration/termination. To
                                 be able to perform effective searches, users have to develop the following expertise: knowledge
                                 about various sources of information, skills in defining search problems and applying search
                                 strategies, and competence in using electronic search tools.
                                 Marchionini (1992) contends that some sort of spreadsheet is needed that supports users in the
                                 problem definition as well as other information seeking tasks. The Info Crystal is such a spreadsheet
                                 because it assists users in the formulation of their information needs and the exploration of the
                                 retrieved documents, using a visual interface that supports a “what-if” functionality. Marchionini further
                                 predicts that advances in computing power and speed, together with improved information retrieval
                                 procedures, will continue to blur the distinctions between problem articulation and examination of
                                 results.




                                          The Info Crystal is both a visual query language and a tool for visualizing retrieval
                                          results.
                                 The information need can be understood as forming a pyramid, where only its peak is made visible
                                 by users in the form of a conceptual query (see Figure 6.1). The conceptual query captures the key





            110                              LOVELY PROFESSIONAL UNIVERSITY