Page 227 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
P. 227

Unit 12: Natural Language Processing




             Rule                                    Input                                      Notes
             S → .NP VP                              they fish
             NP → .they                              they fish

             NP → they.                              fish
            S → NP .VP                               fish
            VP → .Vi                                 fish

            Vi → .fish                               fish
            Vi → fish.
            VP → Vi.
            S → NP VP.

          Of course, in parsing this sentence we have magically chosen the correct rule at each point, and
          ignored choices which would not lead to a successful parse. To fully specify the algorithm we
          would need to eliminate all this magic and provide an exhaustive mechanism for trying out all
          possibilities. We can do this by non-deterministically trying out all alternatives at each point as
          they arise, or, more flexibly, by casting our algorithm in the form of a set of operations on
          representations that encode the state of a parse. Each parsing step takes an ‘item’, as we shall call
          them, and produces a set of new items. These alternatives can be pursued in whatever order is
          convenient. Parsing proceeds from an initial seed item and ends when no more steps are possible.
          Each remaining item, if there are any, will then represent a successful parse of the input. An item
          consists of a list of ‘dotted trees’ with the most recent to the right, and a list of unconsumed
          words. ‘Dotted trees’ are derived from dotted rules in an obvious way (a rule like S -¿ NP .VP is
          represented [S NP .VP]) and represent the partial analyses built so far. The list of words is the
          input remaining to be parsed. There are three basic operations that can be performed on an item:
          (i)  If there is a word after a dot in the most recent tree that matches the next word in the input,
               make a new item like the current one except that the dot follows the word, and the next
               word is removed from the input. For example, an item of the form. < ... [NP .they], [they
               can fish]> will yield a new item:
               < ... [NP they.], [can fish] >
          (ii)  If the most recent tree has the dot at the end of its daughters, integrate it with the tree to its
               left, if there is one. So an item like:
               <[S .NP VP], [NP they.], [can fish]>
               yields a new one:

               <[S [NP they] .VP], [can fish]>
          (iii)  If there is a rule whose left hand side matches the category following the dot in the most
               recent tree, make a new item like the old one with the addition of a new tree derived from
               the rules, with the dot preceding its daughters.
               An item like

               <[S .NP VP], [they can fish]>
               yields one like:
               <[S .NP VP], [NP .they], [they can fish]>





                                           LOVELY PROFESSIONAL UNIVERSITY                                   221
   222   223   224   225   226   227   228   229   230   231   232