Page 237 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
Unit 12: Natural Language Processing
Structural pattern recognition, sometimes referred to as syntactic pattern recognition due to its
origins in formal language theory, relies on syntactic grammars to discriminate among data
from different groups based upon the morphological interrelationships (or interconnections)
present within the data. Structural features, often referred to as primitives, represent the
subpatterns (or building blocks) and the relationships among them which constitute the data.
The semantics associated with each feature are determined by the coding scheme (i.e., the selection
of morphologies) used to identify primitives in the data. Feature vectors generated by structural
pattern recognition systems contain a variable number of features (one for each primitive
extracted from the data) in order to accommodate the presence of superfluous structures which
have no impact on classification. Since the interrelationships among the extracted primitives
must also be encoded, the feature vector must either include additional features describing the
relationships among primitives or take an alternate form, such as a relational graph, that can be
parsed by a syntactic grammar.
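As a rough illustration of the relational-graph form mentioned above, the following minimal Python sketch stores primitives as nodes and their interrelationships as labelled edges. The class name RelationalGraph, the primitive types, and the relation label joined_at_bottom are hypothetical names chosen for this example; they do not come from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class RelationalGraph:
    """Hypothetical container: primitives are nodes, relationships are labelled edges."""
    primitives: dict = field(default_factory=dict)  # primitive id -> primitive type
    relations: list = field(default_factory=list)   # (source id, relation, target id)

    def add_primitive(self, pid, kind):
        self.primitives[pid] = kind

    def relate(self, src, relation, dst):
        # Only primitives that were actually extracted can be related.
        if src not in self.primitives or dst not in self.primitives:
            raise KeyError("both primitives must be added before relating them")
        self.relations.append((src, relation, dst))

    def neighbors(self, pid, relation):
        # All primitives reachable from pid through the given relation.
        return [dst for src, rel, dst in self.relations
                if src == pid and rel == relation]


# Assumed example: the letter "L" as a vertical stroke joined to a horizontal stroke.
g = RelationalGraph()
g.add_primitive("p1", "vertical_line")
g.add_primitive("p2", "horizontal_line")
g.relate("p1", "joined_at_bottom", "p2")
```

Because the graph grows by one node per extracted primitive, it accommodates a variable number of features without fixing the length of the representation in advance.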
The emphasis on relationships within data makes a structural approach to pattern recognition
most sensible for data which contain an inherent, identifiable organization such as image data
(which is organized by location within a visual rendering) and time-series data (which is
organized by time); data composed of independent samples of quantitative measurements lack
ordering and require a statistical approach. Methodologies used to extract structural features
from image data, such as morphological image processing techniques, result in primitives such
as edges, curves, and regions; feature extraction techniques for time-series data include chain
codes, piecewise linear regression, and curve fitting, which are used to generate primitives that
encode sequential, time-ordered relationships. The classification task arrives at an identification
using parsing: the extracted structural features are identified as being representative of a particular
group if they can be successfully parsed by a syntactic grammar. When discriminating among
more than two groups, a syntactic grammar is necessary for each group and the classifier must
be extended with an adjudication scheme so as to resolve multiple successful parsings.
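The classification-by-parsing idea can be sketched for time-series data as follows. This is a minimal example under stated assumptions, not a definitive implementation: the chain code uses three made-up primitives (u for up, d for down, f for flat), each group's grammar is a simple regular grammar written as a Python regular expression, and the adjudication scheme is a fixed priority order. The group names, the tolerance tol, and the priority order are all hypothetical.

```python
import re


def chain_code(series, tol=0.5):
    """Encode each consecutive step as a primitive: u (up), d (down), f (flat)."""
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > tol:
            symbols.append("u")
        elif delta < -tol:
            symbols.append("d")
        else:
            symbols.append("f")
    return "".join(symbols)


# One (assumed) regular grammar per group, expressed as a regular expression.
GRAMMARS = {
    "peak": re.compile(r"f*u+f*d+f*"),       # rises, then falls
    "ramp": re.compile(r"f*u+f*"),           # a single rising run
    "non_decreasing": re.compile(r"[uf]+"),  # never falls
}


def classify(series, priority=("peak", "ramp", "non_decreasing")):
    """Return (chosen group, all groups whose grammar parsed the chain code)."""
    code = chain_code(series)
    accepted = [name for name, grammar in GRAMMARS.items()
                if grammar.fullmatch(code)]
    # Adjudication: when several grammars succeed, prefer the earliest in priority.
    for name in priority:
        if name in accepted:
            return name, accepted
    return None, accepted
```

For instance, the series [0, 0, 1, 2, 2] yields the chain code "fuuf", which both the ramp and non_decreasing grammars accept; the priority order then resolves the tie in favour of ramp. A series that no grammar can parse, such as a steadily falling one, is left unclassified.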
Task: Analyze the approaches for implementing a pattern recognition system.
Self Assessment
State whether the following statements are true or false:
13. Generalization and memory are part of natural language understanding.
14. The natural language processing component of this system consists of four modules that
share the same grammar.
15. Structural pattern recognition is sometimes referred to as syntactic pattern recognition.
12.6 Summary
Natural Language Processing (NLP) is a computerized approach to analyzing text that is
based on both a set of theories and a set of technologies. It is also a very active area of
research and development.
Modern NLP algorithms are grounded in machine learning, especially statistical machine
learning.
Computational linguistics can be considered a synonym for the automatic processing of
natural language, since its main task is precisely the construction
of computer programs that process words and texts in natural language.
LOVELY PROFESSIONAL UNIVERSITY 231