Page 237 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
P. 237

Unit 12: Natural Language Processing




          Structural pattern recognition, sometimes referred to as syntactic pattern recognition due to its  Notes
          origins in formal language theory, relies on syntactic grammars to discriminate among data
          from different groups based upon the morphological interrelationships (or interconnections)
          present within the data. Structural features, often referred to as primitives, represent the
          subpatterns (or building blocks) and the relationships among them which constitute the data.
          The semantics associated with each feature are determined by the coding scheme (i.e., the selection
          of morphologies) used to identify primitives in the data. Feature vectors generated by structural
          pattern recognition systems contain a variable number of features (one for each primitive
          extracted from the data) in order to accommodate the presence of superfluous structures which
          have no impact on classification. Since the interrelationships among the extracted primitives
          must also be encoded, the feature vector must either include additional features describing the
          relationships among primitives or take an alternate form, such as a relational graph, that can be
          parsed by a syntactic grammar.
          The emphasis on relationships within data makes a structural approach to pattern recognition
          most sensible for data which contain an inherent, identifiable organization such as image data
          (which is organized by location within a visual rendering) and time-series data (which is
          organized by time); data composed of independent samples of quantitative measurements, lack
          ordering and require a statistical approach. Methodologies used to extract structural features
          from image data such as morphological image processing techniques result in primitives such
          as edges, curves, and regions; feature extraction techniques for time-series data include chain
          codes, piecewise linear regression, and curve fitting which are used to generate primitives that
          encode sequential, time-ordered relationships. The classification task arrives at an identification
          using parsing: the extracted structural features are identified as being representative of a particular
          group if they can be successfully parsed by a syntactic grammar. When discriminating among
          more than two groups, a syntactic grammar is necessary for each group and the classifier must
          be extended with an adjudication scheme so as to resolve multiple successful parsings.




              Task  Analyze the approaches for implementing a pattern recognition system.

          Self Assessment


          State whether the following statements are true or false:
          13.  Generalization and memory are part of natural language understanding.
          14.  The natural language processing component of this system consists of four modules that
               share the same grammar.

          15.  Structural pattern recognition sometimes referred to as syntactic pattern recognition.

          12.6 Summary

               Natural Language Processing (NLP) is the computerized approach to analyzing text that is
               based on both a set of theories and a set of technologies. And, being a very active area of
               research and development.
               Modern NLP algorithms are grounded in machine learning, especially statistical machine
               learning.
               Computational linguistics might be considered as a synonym of automatic processing of
               natural language, since the main task of computational linguistics is just the construction
               of computer programs to process words and texts in natural language.



                                           LOVELY PROFESSIONAL UNIVERSITY                                   231
   232   233   234   235   236   237   238   239   240   241   242