Page 222 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS

Introduction to Artificial Intelligence & Expert Systems




                                   human or natural language input. Up to the 1980s, most NLP systems were based on complex
                                   sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in NLP
                                   with the introduction of learning algorithms for language processing. This was due both to the
                                   steady increase in computational power resulting from Moore’s Law and the gradual lessening
                                   of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose
                                   theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-
                                   learning approach to language processing. Some of the earliest-used machine learning algorithms,
                                   such as decision trees, produced systems of hard if-then rules similar to existing hand-written
                                   rules. Increasingly, however, research has focused on statistical models, which make soft,
                                   probabilistic decisions based on attaching real-valued weights to the features making up the
                                   input data. The cache language models upon which many speech recognition systems now rely
                                   are examples of such statistical models. Such models are generally more robust when given
                                   unfamiliar input, especially input that contains errors (as is very common for real-world data),
                                   and produce more reliable results when integrated into a larger system comprising multiple
                                   subtasks.
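The contrast between hard if-then rules and soft, weighted decisions can be sketched in a few lines. This is a minimal illustration, not a description of any particular system: the two labels, the feature names, and the weight values are all invented, and the scoring shown (summing the weights of the active features and normalizing with a softmax) is one common way to turn real-valued weights into probabilities.

```python
import math

# Hypothetical weights for a toy two-class word tagger (NOUN vs VERB).
# Feature names and values are invented for illustration only.
WEIGHTS = {
    "NOUN": {"prev_is_determiner": 2.0, "ends_in_ing": -0.5, "capitalized": 0.8},
    "VERB": {"prev_is_determiner": -1.5, "ends_in_ing": 1.2, "capitalized": -0.3},
}

def hard_rule(features):
    """Hand-written style if-then rule: a single answer with no confidence."""
    if "prev_is_determiner" in features:
        return "NOUN"
    return "VERB"

def soft_decision(features):
    """Sum the weights of the active features per label, then apply a softmax."""
    scores = {
        label: sum(weights.get(f, 0.0) for f in features)
        for label, weights in WEIGHTS.items()
    }
    total = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / total for label, s in scores.items()}

# Conflicting evidence: the rule commits to NOUN, the model returns a distribution.
features = {"prev_is_determiner", "ends_in_ing"}
rule_answer = hard_rule(features)
probs = soft_decision(features)
print(rule_answer, probs)
```

Because the soft model returns a probability for every label, a larger system can combine it with evidence from other subtasks, and an input with no recognized features simply yields an even split rather than a confident wrong answer.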
                                   Many of the notable early successes occurred in the field of machine translation, due especially
                                   to work at IBM Research, where successively more complicated statistical models were developed.
                                   These systems were able to take advantage of existing multilingual textual corpora that had
                                   been produced by the Parliament of Canada and the European Union as a result of laws calling
                                   for the translation of all governmental proceedings into all official languages of the
                                   corresponding systems of government. However, most other systems depended on corpora
                                   specifically developed for the tasks implemented by these systems, which was (and often continues
                                   to be) a major limitation in the success of these systems. As a result, a great deal of research has
                                   gone into methods of more effectively learning from limited amounts of data.
                                   Recent research has increasingly focused on unsupervised and semi-supervised learning
                                   algorithms. Such algorithms are able to learn from data that has not been hand-annotated with
                                   the desired answers, or using a combination of annotated and non-annotated data. Generally,
                                   this task is much more difficult than supervised learning, and typically produces less accurate
                                   results for a given amount of input data. However, there is an enormous amount of non-annotated
                                   data available (including, among other things, the entire content of the World Wide Web),
                                   which can often make up for the inferior results.
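One simple semi-supervised scheme is self-training: fit a model on the few annotated examples, label the unannotated examples it is confident about, and refit. The sketch below assumes a deliberately tiny setting (one numeric feature, two classes, a nearest-centroid classifier); the function names, the `margin` threshold, and the data are hypothetical and chosen only to make the loop visible.

```python
def centroid(values):
    """Mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def self_train(labeled, unlabeled, margin=1.0, rounds=5):
    """Self-training with a nearest-centroid classifier.

    labeled:   list of (x, label) pairs, labels "A" or "B" (both must be present)
    unlabeled: list of x values without labels
    Returns the enlarged labeled list and the points left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        center_a = centroid([x for x, y in labeled if y == "A"])
        center_b = centroid([x for x, y in labeled if y == "B"])
        confident, rest = [], []
        for x in pool:
            dist_a, dist_b = abs(x - center_a), abs(x - center_b)
            # Only accept a pseudo-label when one centroid is clearly closer.
            if abs(dist_a - dist_b) >= margin:
                confident.append((x, "A" if dist_a < dist_b else "B"))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new was learned this round
        labeled.extend(confident)
        pool = rest
    return labeled, pool

# Two annotated points and five unannotated ones (invented data).
seed = [(0.0, "A"), (10.0, "B")]
raw = [1.0, 2.0, 8.5, 9.0, 5.2]
final_labeled, still_unlabeled = self_train(seed, raw)
print(final_labeled, still_unlabeled)
```

The ambiguous point near the middle is left unlabeled rather than guessed, which reflects the trade-off described above: pseudo-labels are noisier than hand annotation, so a sketch like this only accepts the ones it is confident about.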

                                   12.1 Overview of Linguistics

                                   Linguistics is the scientific study of human language, and is therefore also called linguistic
                                   science. It describes and explains the nature of human languages, treating language and the
                                   ways people use it as phenomena to be studied. A linguist, in the academic sense, is a person
                                   who studies natural language and has expertise in linguistics; linguists study the general
                                   principles of language organization and language behavior. Linguistic analysis is concerned
                                   with identifying the structural units and classes of a language. Linguists also attempt to
                                   describe how smaller units combine to form larger grammatical units: how words combine to
                                   form phrases, phrases combine to form clauses, and so on. They are also concerned with what
                                   constrains the possible meanings of a sentence. To do this, linguists use intuitions about
                                   well-formedness and meaning together with mathematical models of structure, such as formal
                                   language theory and model-theoretic semantics. The structure of language includes morphemes,
                                   words, phrases, and grammatical classes. The sub-fields concerned with linguistic structure
                                   are phonetics, phonology, morphology, syntax, semantics, pragmatics, and discourse analysis.
                                   Other branches of linguistics include applied linguistics, computational linguistics,
                                   evolutionary linguistics, neurolinguistics, cognitive linguistics, and psycholinguistics.
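The idea that words combine into phrases and phrases into larger units can be made concrete with a small formal grammar. The sketch below assumes a toy context-free grammar in Chomsky normal form and checks well-formedness with the standard CYK recognition algorithm; the rules, category names, and vocabulary are invented for illustration and cover only a handful of English sentences.

```python
# Toy grammar in Chomsky normal form (invented for illustration).
# BINARY maps a (left child, right child) pair to the parent category.
BINARY = {
    ("NP", "VP"): "S",    # sentence = noun phrase + verb phrase
    ("Det", "N"): "NP",   # noun phrase = determiner + noun
    ("V", "NP"): "VP",    # verb phrase = verb + noun phrase
}
LEXICON = {
    "the": {"Det"}, "a": {"Det"},
    "dog": {"N"}, "cat": {"N"},
    "sees": {"V"}, "chases": {"V"},
}

def recognize(words):
    """CYK recognition: True if the words form a sentence (S) under the toy grammar."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that can span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for length in range(2, n + 1):          # span length
        for i in range(n - length + 1):     # span start
            j = i + length - 1              # span end
            for k in range(i, j):           # split point
                for left in table[i][k]:
                    for right in table[k + 1][j]:
                        parent = BINARY.get((left, right))
                        if parent:
                            table[i][j].add(parent)
    return "S" in table[0][n - 1]

print(recognize("the dog chases a cat".split()))
print(recognize("dog the chases".split()))
```

A recognizer of this kind captures only well-formedness; describing what constrains the possible meanings of a sentence requires additional machinery that this sketch does not attempt.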





          216                               LOVELY PROFESSIONAL UNIVERSITY