Page 147 - DCAP506_ARTIFICIAL_INTELLIGENCE

Unit 11: Natural Language Processing




          evaluation of dependency parsers were performed in the context of the CoNLL shared tasks in
          2006 and 2007. In Italy, the EVALITA campaign was conducted in 2007 to compare various tools for
          Italian (see the EVALITA web site). In France, within the ANR-Passage project (end of 2007),
          10 parsers for French were compared (see the Passage web site).

          Adda G., Mariani J., Paroubek P., Rajman M. 1999. L’action GRACE d’évaluation de l’assignation
          des parties du discours pour le français. Langues, vol. 2.

          Black E., Abney S., Flickinger D., Gdaniec C., Grishman R., Harrison P., Hindle D., Ingria R.,
          Jelinek F., Klavans J., Liberman M., Marcus M., Roukos S., Santorini B., Strzalkowski T. 1991.
          A procedure for quantitatively comparing the syntactic coverage of English grammars. DARPA
          Speech and Natural Language Workshop.

          Hirschman L. 1998. Language understanding evaluation: lessons learned from MUC and ATIS.
          LREC, Granada.

          Pallett D.S. 1998. The NIST role in automatic speech recognition benchmark tests.

          11.1.2 Tasks and Limitations of NLP

          In theory, natural-language processing is a very attractive method of human-computer interaction.
          Early systems such as SHRDLU, working in restricted “blocks worlds” with restricted
          vocabularies, worked extremely well, leading researchers to excessive optimism, which was
          soon lost when the systems were extended to more realistic situations with real-world ambiguity
          and complexity. Natural-language understanding is sometimes referred to as an AI-complete
          problem, because natural-language recognition seems to require extensive knowledge about
          the outside world and the ability to manipulate it. The definition of “understanding” is one of
          the major problems in natural-language processing.

          11.1.3 Sub-problems of NLP


          Speech Segmentation

          In most spoken languages, the sounds representing successive letters blend into each other, so
          the conversion of the analog signal to discrete characters can be a very difficult process. Also, in
          natural speech there are hardly any pauses between successive words; the location of those
          boundaries usually must take into account grammatical and semantic constraints, as well as the
          context.
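
          The ambiguity of word boundaries in a continuous stream can be illustrated with a small
          sketch. It is only a toy: letters stand in for sounds, the lexicon is a hypothetical
          six-word list, and the function name segmentations is an assumption made for this
          example. A real recognizer would work on acoustic features and rank candidates with
          grammatical and semantic knowledge.

```python
from functools import lru_cache

# Hypothetical toy lexicon; letters stand in for speech sounds.
LEXICON = {"no", "now", "here", "where", "nowhere", "is"}


def segmentations(stream, lexicon=frozenset(LEXICON)):
    """Return every way to split `stream` into words found in `lexicon`."""

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the stream yields one (empty) completion.
        if start == len(stream):
            return [[]]
        results = []
        for end in range(start + 1, len(stream) + 1):
            word = stream[start:end]
            if word in lexicon:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results

    return [" ".join(words) for words in split_from(0)]


if __name__ == "__main__":
    for candidate in segmentations("nowhereis"):
        print(candidate)
```

          The unbroken stream "nowhereis" admits three segmentations under this lexicon
          ("no where is", "now here is" and "nowhere is"), so the lexicon alone cannot pick one;
          that choice is where the grammatical, semantic and contextual constraints come in.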

          Text Segmentation

          Some written languages, such as Chinese, Japanese and Thai, do not mark word boundaries
          either, so any significant text parsing usually requires the identification of word boundaries,
          which is often a non-trivial task.
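
          One simple baseline for finding word boundaries in unspaced text is greedy forward
          maximum matching against a dictionary. The sketch below assumes a tiny hypothetical
          dictionary and a made-up function name, max_match; it is not the method of any
          particular system, and practical segmenters generally rely on statistical models instead.

```python
# Hypothetical toy dictionary for illustration only.
DICTIONARY = {"我", "喜欢", "自然", "语言", "处理", "自然语言"}


def max_match(text, dictionary=DICTIONARY):
    """Greedy forward maximum matching: always take the longest known word."""
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted as a fallback for unknown words.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    print(max_match("我喜欢自然语言处理"))
```

          Because the matcher is greedy, it prefers "自然语言" over the shorter "自然" followed by
          "语言". That preference is often right but not always, which is one reason the task is
          non-trivial.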

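          Part-of-speech Tagging

          Each word must be assigned its grammatical category (noun, verb, adjective and so on), which
          often depends on the context in which the word appears.

          Word Sense Disambiguation

          Many words have more than one meaning; we have to select the
          meaning which makes the most sense in context.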
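
          Selecting a meaning from context can be sketched with a simplified Lesk-style overlap
          count: the sense whose dictionary gloss shares the most words with the sentence wins.
          The two-sense inventory for "bank", the glosses, the stopword list and the function name
          simplified_lesk are all assumptions made for this illustration.

```python
# Hypothetical sense inventory with hand-written glosses.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "the sloping land beside a river or stream of water",
    }
}

# Small assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at", "i"}


def simplified_lesk(word, sentence, senses=SENSES):
    """Pick the sense of `word` whose gloss overlaps most with `sentence`."""
    context = set(sentence.lower().split()) - STOPWORDS
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        gloss_words = set(gloss.split()) - STOPWORDS
        overlap = len(context & gloss_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "I deposited money at the bank"))
    print(simplified_lesk("bank", "We fished from the bank of the river"))
```

          Exact word overlap is a crude signal (for instance, "deposited" does not match
          "deposits" here), so this should be read as an illustration of the idea rather than a
          usable disambiguator.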

          Syntactic Ambiguity

          The grammar for natural languages is ambiguous, i.e. there are often multiple possible parse
          trees for a given sentence. Choosing the most appropriate one usually requires semantic and
          contextual information. Specific problem components of syntactic ambiguity include sentence
          boundary disambiguation.
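
          The existence of multiple parse trees for one sentence can be made concrete by counting
          them. The sketch below uses a CKY-style chart over a small hypothetical grammar in
          Chomsky normal form; the grammar, the lexicon and the function name count_parses are
          assumptions chosen to show the classic prepositional-phrase attachment ambiguity.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon mapping each word to its possible categories.
LEXICON = {
    "i": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` using a CKY-style chart."""
    n = len(words)
    # chart[i, j, symbol] = number of trees rooted in `symbol` spanning words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for symbol in LEXICON[word]:
            chart[i, i + 1, symbol] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]


if __name__ == "__main__":
    sentence = "i saw the man with the telescope".split()
    print(count_parses(sentence))
```

          Under this grammar the sentence has two trees: in one, "with the telescope" modifies
          "the man"; in the other, it modifies "saw". Nothing in the grammar prefers either
          reading, which is why semantic and contextual information is needed to choose.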






                                           LOVELY PROFESSIONAL UNIVERSITY                                   141