Page 222 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
Introduction to Artificial Intelligence & Expert Systems
human or natural language input. Up to the 1980s, most NLP systems were based on complex
sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in NLP
with the introduction of learning algorithms for language processing. This was due both to the
steady increase in computational power resulting from Moore’s Law and the gradual lessening
of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose
theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-
learning approach to language processing. Some of the earliest-used machine learning algorithms,
such as decision trees, produced systems of hard if-then rules similar to existing hand-written
rules. Increasingly, however, research has focused on statistical models, which make soft,
probabilistic decisions based on attaching real-valued weights to the features making up the
input data. The cache language models upon which many speech recognition systems now rely
are examples of such statistical models. Such models are generally more robust when given
unfamiliar input, especially input that contains errors (as is very common for real-world data),
and produce more reliable results when integrated into a larger system comprising multiple
subtasks.
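The contrast between hard if-then rules and soft, weighted decisions can be sketched in a few
lines of Python. The task (deciding whether a period ends a sentence), the feature names, and
the weight values below are illustrative assumptions, not taken from any particular system; in
a real statistical model the weights would be learned from a corpus rather than chosen by hand.

```python
import math

# Hand-picked, purely illustrative weights (assumption: a real model learns these).
WEIGHTS = {
    "next_is_capitalized": 2.0,
    "prev_is_abbreviation": -3.0,
    "prev_is_single_letter": -1.5,
}
BIAS = 0.5


def hard_rule(features):
    """Hard if-then decision: the first matching rule decides, with no degrees."""
    if features.get("prev_is_abbreviation"):
        return False
    if features.get("next_is_capitalized"):
        return True
    return False


def soft_score(features):
    """Soft decision: sum the weights of the active features, then map to (0, 1)."""
    z = BIAS + sum(
        WEIGHTS[name]
        for name, active in features.items()
        if active and name in WEIGHTS
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


# "Dr. Smith": an abbreviation followed by a capitalized word.
ambiguous = {"prev_is_abbreviation": True, "next_is_capitalized": True}
print(hard_rule(ambiguous))   # the rule gives a flat yes/no answer
print(soft_score(ambiguous))  # the model gives a probability of a sentence boundary
```

The hard rule can only answer yes or no, whereas the weighted model returns a probability that a
larger system can combine with other evidence, which is one reason such models tend to degrade
more gracefully on unfamiliar or noisy input.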
Many of the notable early successes occurred in the field of machine translation, due especially
to work at IBM Research, where successively more complicated statistical models were developed.
These systems were able to take advantage of existing multilingual textual corpora that had
been produced by the Parliament of Canada and the European Union as a result of laws calling
for the translation of all governmental proceedings into all official languages of the
corresponding systems of government. However, most other systems depended on corpora
specifically developed for the tasks implemented by these systems, which was (and often continues
to be) a major limitation in the success of these systems. As a result, a great deal of research has
gone into methods of more effectively learning from limited amounts of data.
Recent research has increasingly focused on unsupervised and semi-supervised learning
algorithms. Such algorithms are able to learn from data that has not been hand-annotated with
the desired answers, or using a combination of annotated and non-annotated data. Generally,
this task is much more difficult than supervised learning, and typically produces less accurate
results for a given amount of input data. However, there is an enormous amount of non-annotated
data available (including, among other things, the entire content of the World Wide Web),
which can often make up for the inferior results.
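One simple way to combine annotated and non-annotated data is self-training: fit a model on the
labeled examples, label the unlabeled examples it is confident about, and repeat. The sketch
below is a minimal illustration of that idea only. The nearest-centroid classifier, the single
numeric feature, the labels, and the `min_margin` threshold are all assumptions chosen to keep
the example short; they are not the method of any system mentioned in the text.

```python
def centroids(labeled):
    """Mean feature value per label, from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in labeled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(cents, value):
    """Return (label, margin) for the nearest centroid.

    The margin is how much closer the best centroid is than the runner-up,
    used here as a crude confidence score. Assumes at least two labels.
    """
    ranked = sorted(cents, key=lambda label: abs(value - cents[label]))
    best, second = ranked[0], ranked[1]
    margin = abs(value - cents[second]) - abs(value - cents[best])
    return best, margin


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Grow the labeled set with confident predictions on the unlabeled pool."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        newly, remaining = [], []
        for value in pool:
            label, margin = predict(cents, value)
            if margin >= min_margin:
                newly.append((value, label))
            else:
                remaining.append(value)
        if not newly:
            break  # nothing confident left to add
        labeled.extend(newly)
        pool = remaining
    return centroids(labeled), pool


seed = [(1.0, "short"), (9.0, "long")]      # two hand-annotated examples
raw = [1.5, 2.0, 8.0, 8.5, 5.2]             # non-annotated examples
model, undecided = self_train(seed, raw)
print(model, undecided)
```

Items the model is unsure about stay in the unlabeled pool rather than being forced into a
class, which limits how far an early mistake can propagate through later rounds.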
12.1 Overview of Linguistics
Linguistics is the scientific study of human language, which is why it is also called linguistic
science. It describes and explains the nature of human languages, treating language and the ways
people use it as phenomena to be studied. A linguist is an expert in linguistics who studies the
general principles of language organization and language behavior. Linguistic analysis is
concerned with identifying the structural units and classes of a language. Linguists also attempt
to describe how smaller units combine to form larger grammatical units: how words combine to form
phrases, phrases combine to form clauses, and so on. They are also concerned with what constrains
the possible meanings of a sentence. To do this, linguists draw on intuitions about
well-formedness and meaning, and on mathematical models of structure such as formal language
theory and model-theoretic semantics. The structure of language includes morphemes, words,
phrases, and grammatical classes.
The sub-fields concerned with linguistic structure are phonetics, phonology, morphology, syntax,
semantics, pragmatics, and discourse analysis. Linguistics also has many branches, including
applied linguistics, computational linguistics, evolutionary linguistics, neurolinguistics,
cognitive linguistics, and psycholinguistics. In the academic sense, a linguist is a person who
studies natural language; the academic discipline itself is known as linguistics.
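The idea that words combine into phrases and phrases into clauses, and that formal language
theory can model this, can be made concrete with a toy context-free grammar. The sketch below
uses the CYK recognition algorithm on a grammar in Chomsky normal form. The grammar, the
category names (S, NP, VP, Det, N, V), and the six-word lexicon are invented for illustration
and cover only a tiny fragment of English.

```python
# Binary rules: (left child, right child) -> parent category.
BINARY_RULES = {
    ("NP", "VP"): "S",    # a clause is a noun phrase followed by a verb phrase
    ("Det", "N"): "NP",   # a noun phrase is a determiner followed by a noun
    ("V", "NP"): "VP",    # a verb phrase is a verb followed by a noun phrase
}

# Lexical rules: word -> grammatical class (toy lexicon, one class per word).
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "sees": "V", "chases": "V",
}


def recognize(words):
    """Return True if the word sequence forms a complete clause (S) under the toy grammar."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every category that can span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        if word in LEXICON:
            table[i][i].add(LEXICON[word])
    # Build longer spans from pairs of adjacent shorter spans.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length - 1
            for split in range(start, end):
                for left in table[start][split]:
                    for right in table[split + 1][end]:
                        parent = BINARY_RULES.get((left, right))
                        if parent is not None:
                            table[start][end].add(parent)
    return "S" in table[0][n - 1]


print(recognize("the dog sees a cat".split()))  # well-formed under this grammar
print(recognize("dog the sees".split()))        # not well-formed under this grammar
```

Here "the dog" and "a cat" are first grouped into noun phrases, "sees a cat" into a verb phrase,
and the two together into a clause, mirroring the bottom-up composition described above.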
216 LOVELY PROFESSIONAL UNIVERSITY