Page 159 - DCAP506_ARTIFICIAL_INTELLIGENCE

Unit 11: Natural Language Processing




               to the message. Summarization systems have been developed by, e.g., Marcu (1997). The
               approach was to parse the discourse relations (rhetorical relations; Mann & Thompson,
               1988) and, on the basis of the parse, to choose which discourse segments were most
               relevant. The approach relied to a large extent on surface cues, such as discourse markers.
          3.   Natural Language Generation: Marcu (1997) also tested rhetorical relations for Natural
               Language Generation (NLG). On the basis of rhetorical relations, the ordering preferences of
               discourse segments were scored. The higher the score, the more likely it was that the discourse
               structure was coherent. Kibble & Power (1999) use Centering theory for planning the
               most coherent stretch of utterances. These brief examples are, of course, just a very limited
               sample of applications that use some kind of discourse-related information. Still, I will
               finish here, make some concluding remarks, and connect to my own dissertation
               subject.





              Task  Illustrate the term “discourse”.
          Self Assessment


          Fill in the blanks:
          6.   In .............................. Analysis, individual words are analyzed into their components, and
               non-word tokens, such as punctuation, are separated from the words.

          7.   In .............................., linear sequences of words are transformed into structures that show
               how the words relate to each other.
          8.   .............................. concentrates on analyzing the words in a sentence so as to reveal the
               grammatical structure of the sentence.
          9.   Design patterns intend to increase the flexibility of a model by ..............................  some
               aspects of a class.
          10.  The term  .............................. includes both spoken  and written  forms, as well as  both
               monologue and dialogue.

          11.  .............................. is what makes a collection of sentences/utterances a discourse.

          11.3 Spell Checking

          The goal of spell checking is the detection and correction of typographic and orthographic
          errors in the text at the level of the individual word, considered out of its context.

          No one writes without errors. Even people who know the rules of the language well can,
          just by accident, press a wrong key on the keyboard (perhaps one adjacent to the correct one) or
          miss out a letter. Moreover, when typing, one sometimes does not coordinate the
          movements of the hands and fingers correctly. All such errors are known as typos, or typographic errors.
          On the other hand, some people do not know the correct spelling of some words, particularly in
          a foreign language. Such errors are known as spelling errors.
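
          The three kinds of typos described above (a missed letter, a neighbouring key, poorly
          coordinated fingers) can be imitated with a few string operations. The following is only a
          minimal sketch: the function names and the small ADJACENT_KEYS table are assumptions made
          for illustration, and the table covers just four keys of a QWERTY layout.

```python
# Hypothetical, deliberately partial map of neighbouring keys on a QWERTY keyboard.
ADJACENT_KEYS = {
    "a": "qwsz",
    "e": "wrsd",
    "o": "ipkl",
    "t": "rygf",
}


def omissions(word):
    """Variants of `word` with exactly one letter missed out."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


def adjacent_key_slips(word):
    """Variants with one letter replaced by a neighbouring key (where known)."""
    return {
        word[:i] + key + word[i + 1:]
        for i, ch in enumerate(word)
        for key in ADJACENT_KEYS.get(ch, "")
    }


def transpositions(word):
    """Variants with two neighbouring, different letters typed in the wrong order."""
    return {
        word[:i] + word[i + 1] + word[i] + word[i + 2:]
        for i in range(len(word) - 1)
        if word[i] != word[i + 1]
    }


if __name__ == "__main__":
    print(sorted(omissions("the")))
    print(sorted(adjacent_key_slips("the")))
    print(sorted(transpositions("the")))
```

          Most strings produced this way are not words of the language, which is what makes typos
          comparatively easy to detect.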
          First, a spell checker simply detects the strings that are not correct words in a given
          natural language. It is assumed that most orthographic or typographic errors lead to
          strings that are impossible as separate words in this language. Detecting errors that
          accidentally turn one word into another existing word, such as English ‘then’ into ‘than’ or
          Spanish ‘czar’ into ‘Caesar’, is a task that requires much more powerful tools.
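
          The detection step can be sketched as a simple lookup of every token in a word list. The
          code below is an illustration under stated assumptions, not a complete spell checker: the
          LEXICON set and the function find_non_words are hypothetical, and a real system would load
          a full dictionary of the language.

```python
import re

# Hypothetical miniature lexicon; a real checker would load a full word list.
LEXICON = {"i", "would", "rather", "walk", "then", "than", "drive", "the", "car"}


def find_non_words(text, lexicon=LEXICON):
    """Return the tokens of `text` that are not in the lexicon, in order of appearance."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in lexicon]


if __name__ == "__main__":
    # Non-word errors are caught by the lookup.
    print(find_non_words("I would rathr walk than drive teh car"))
    # A real-word error ('then' typed for 'than') passes unnoticed.
    print(find_non_words("I would rather walk then drive the car"))
```

          The second call reports nothing, because ‘then’ is itself a word in the lexicon. Catching
          such errors requires looking at the context of the word, which a plain lookup cannot do.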




                                           LOVELY PROFESSIONAL UNIVERSITY                                   153