Page 235 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS

Unit 12: Natural Language Processing




          to be created can be determined when new stories are read. Generalizations are analyzed as to
          their critical parts and evaluated in light of later evidence. The knowledge structures used and a
          number of examples of the system in operation are presented.

          12.5.1 Natural Language Systems

          The START Natural Language System is a software system designed to answer questions that are posed
          to it in natural language. START parses incoming questions, matches the queries created from
          the parse trees against its knowledge base and presents the appropriate information segments
          to the user. In this way, START provides untrained users with speedy access to knowledge that
          in many cases would take an expert some time to find.
          START (SynTactic Analysis using Reversible Transformations) was developed by Boris Katz at
          MIT’s Artificial Intelligence Laboratory. Currently, the system is undergoing further
          development by the InfoLab Group, led by Boris Katz. This system was first connected to the
          World Wide Web in December 1993 and, in its various forms, has to date answered millions of
          questions from users around the world.

          A key technique called “natural language annotation” helps to connect information seekers to
          information sources. This technique employs natural language sentences and phrases –
          annotations – as descriptions of content that are associated with information segments at various
          granularities. An information segment is retrieved when its annotation matches an input question.
          This method allows the system to handle a wide variety of media, including text, diagrams, images,
          video and audio clips, data sets, Web pages, and others.
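
          The retrieval idea behind natural language annotation can be illustrated with a minimal
          sketch. It assumes a much cruder matcher than the one described above: annotations and
          questions are compared by counting shared content words, whereas START matches queries
          built from parse trees. The segment data, the stop-word list, and the names content_words
          and retrieve are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stop-word list; a real system would rely on parsing, not word filtering.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "what", "who",
              "how", "does", "do", "in", "to"}


def content_words(text):
    """Lower-case the text and keep only the words that carry content."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


# Each information segment (text, image, clip, ...) carries its own annotations.
# The entries below are toy data, not taken from any real knowledge base.
SEGMENTS = [
    {"media": "image", "content": "mars_surface.jpg",
     "annotations": ["Mars has a red surface", "picture of Mars"]},
    {"media": "text",
     "content": "Boiling point of water: 100 degrees Celsius at sea level.",
     "annotations": ["water boils at a temperature", "boiling point of water"]},
]


def retrieve(question, segments=SEGMENTS):
    """Return the segment whose annotation best overlaps the question, or None."""
    q = content_words(question)
    best, best_score = None, 0
    for seg in segments:
        for annotation in seg["annotations"]:
            score = len(q & content_words(annotation))
            if score > best_score:
                best, best_score = seg, score
    return best


if __name__ == "__main__":
    hit = retrieve("What is the boiling point of water?")
    print(hit["media"], "->", hit["content"])
```

          Because the match is made against the annotation rather than the content itself, the
          same lookup works for an image file as for a text passage, which is the property that
          lets annotations cover several media types.
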
          The natural language processing component of this system consists of two modules that share
          the same grammar. The understanding module analyzes English text and produces a knowledge
          base that encodes information found in the text. Given an appropriate segment of the knowledge
          base, the generating module produces English sentences. Used in conjunction with the technique
          of natural language annotation, these modules put the power of sentence-level natural language
          processing to use in the service of multimedia information access.
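
          The relationship between the two modules can be sketched as a round trip through a
          knowledge base. The example below is a toy under stated assumptions: facts are stored as
          (subject, relation, object) triples, the "grammar" is a single subject-verb-object
          pattern with a three-entry verb table, and the names VERBS, understand, and generate are
          hypothetical. It is not the grammar or representation used by the actual system; it only
          shows how one shared table can drive both analysis and generation.

```python
# One shared, deliberately tiny "grammar": sentence = SUBJECT VERB OBJECT.
# Both modules consult the same verb table (hypothetical entries).
VERBS = {"discovered": "discover", "orbits": "orbit", "wrote": "write"}
# Inverse view of the same table, used when generating text.
SURFACE = {relation: verb for verb, relation in VERBS.items()}


def understand(sentence, kb):
    """Analyze one English sentence and add the encoded fact to the knowledge base."""
    words = sentence.rstrip(".").split()
    for i, word in enumerate(words):
        # The verb must have at least one word on each side.
        if word in VERBS and 0 < i < len(words) - 1:
            fact = (" ".join(words[:i]), VERBS[word], " ".join(words[i + 1:]))
            kb.add(fact)
            return fact
    raise ValueError(f"sentence not covered by the toy grammar: {sentence!r}")


def generate(fact):
    """Produce an English sentence from one knowledge-base fact."""
    subject, relation, obj = fact
    return f"{subject} {SURFACE[relation]} {obj}."


if __name__ == "__main__":
    knowledge_base = set()
    fact = understand("Marie Curie discovered radium.", knowledge_base)
    print(fact)
    print(generate(fact))
```

          Keeping a single table for both directions means that any sentence the understanding
          function accepts can be regenerated from the stored fact.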

          12.5.2 Recognition and Classification Process

          It is generally easy for a person to differentiate the sound of a human voice from that of a violin,
          a handwritten numeral “3” from an “8”, and the aroma of a rose from that of an onion. However,
          it is difficult for a programmable computer to solve these kinds of perceptual problems. These
          problems are difficult because each pattern usually contains a large amount of information, and
          the recognition problems typically have an inconspicuous, high-dimensional structure.
          Pattern recognition is the science of making inferences from perceptual data, using tools from
          statistics, probability, computational geometry, machine learning, signal processing, and
          algorithm design. Thus, it is of central importance to artificial intelligence and computer vision,
          and has far-reaching applications in engineering, science, medicine, and business. In particular,
          advances made during the last half century now allow computers to interact more effectively
          with humans and the natural world (e.g., speech recognition software). However, the most
          important problems in pattern recognition are yet to be solved.
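
          As a small illustration of making an inference from perceptual data, the sketch below
          classifies a handwritten "3" or "8" with a nearest-centroid rule, one of the simplest
          statistical classifiers. Everything specific in it is an assumption made for the
          example: the two features (number of enclosed loops and fraction of ink in the left half
          of the image), the feature values, and the names train and classify. Real recognizers
          work with far higher-dimensional inputs.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def train(examples):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vectors) for label, vectors in examples.items()}


def classify(model, x):
    """Assign x to the class whose centroid is nearest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], x))


# Hypothetical hand-made features: (enclosed loops, fraction of ink in left half).
EXAMPLES = {
    "3": [(0, 0.20), (0, 0.25), (0, 0.30)],
    "8": [(2, 0.50), (2, 0.45), (2, 0.55)],
}

MODEL = train(EXAMPLES)

if __name__ == "__main__":
    print(classify(MODEL, (0, 0.28)))
    print(classify(MODEL, (2, 0.48)))
```

          The classifier itself is trivial; the difficulty noted in the text lies in obtaining
          features that separate the classes reliably when the raw pattern is a large image or
          audio signal.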

          It is natural that we should seek to design and build machines that can recognize patterns. From
          automated speech recognition, fingerprint identification, and optical character recognition to DNA
          sequence identification and much more, it is clear that reliable, accurate pattern recognition by
          machine would be immensely useful. Moreover, in solving the many problems
          required to build such systems, we gain a deeper understanding of and appreciation for pattern
          recognition systems. For some problems, such as speech and visual recognition, our design




                                           LOVELY PROFESSIONAL UNIVERSITY                                   229