Page 226 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
P. 226
Introduction to Artificial Intelligence & Expert Systems
Notes complexity and termination properties may be different. Parsing algorithms are usually designed
for classes of grammar rather than tailored towards individual grammars.
Notes There are several important properties that a parsing algorithm should have if it is
to be practically useful. It should be ‘sound’ with respect to a given grammar and lexicon;
that is, it should not assign to an input sentence analyses which cannot arise from the
grammar in question.
It should also be ‘complete’; that it should assign to an input sentence all the analyses it can have
with respect to the current grammar and lexicon. Ideally, the algorithm should also be ‘efficient’,
entailing the minimum of computational work consistent with fulfilling the first two
requirements, and ‘robust’: behaving in a reasonably sensible way when presented with a
sentence that it is unable to fully analyse successfully. In this discussion, we generally ignore the
latter two requirements: to some extent, they are covered by the companion section on ‘chart
parsing’. There are several other dimensions on which is useful to characterise the behaviour of
parsing algorithms. One can characterise their search strategy in terms of the characteristic
alternatives of depth-first or breadth-first (q.v.). Orthogonally, one can characterise them in
terms of the direction in which a structure is built: from the words upwards (‘bottom up’), or
from the root node downwards (‘top down’). A third dimension is in terms of the sequential
processing of input words: usually this is left-to-right, but right-to-left or ‘middle-out’ strategies
are also feasible and may be preferable in some applications (e.g. parsing the output of a speech
recogniser).
12.3.1 Top Down Parsing
We can illustrate a simple top down algorithm for a CFG as follows. Let the grammar contain
the following rules (which for simplicity also introduce lexical entries instead of presupposing
a separate component for this):
S → NP VP
NP → they | fish
VP → Aux VP
VP → Vi
VP → Vt NP
Aux → can
Vi → fish
Vt → can
This will assign to the sentence ‘they can sh’ two distinct analyses corresponding to two
interpretations ‘they are able/permitted to sh’ and ‘they put sh in cans’. (It will also generate
several other good sentences and lots of odd ones). The top down algorithm is very simple. We
begin with the start symbol, in this case, S, and see what rules it figures in as the mother. Then
we look at the daughters of these rules and see whether the first one matches the next word in the
input. If it does, we do the same for the remaining daughters. If not, we go through the same
loop again, this time matching the daughters of the rules already found with mothers of other
rules. The algorithm is easily illustrated on the input sentence ‘they sh’: the ‘dot’ precedes the
daughter of the rule we are currently looking at, and the level of indentation indicates which
rule has invoked the current action.
220 LOVELY PROFESSIONAL UNIVERSITY