
Introduction to Artificial Intelligence & Expert Systems




                    Notes              (More concisely) The posterior is proportional to the prior times the likelihood.
Note that Bayes' rule can also be written as follows:

    P(H|E) = [P(E|H) / P(E)] · P(H)

where the factor P(E|H)/P(E) represents the impact of E on the probability of H.
Informally and rationally, Bayes' rule makes a great deal of sense. If the evidence does not match up
with a hypothesis, one should reject the hypothesis. But if a hypothesis is extremely unlikely a
priori, one should also reject it, even if the evidence does appear to match up.
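This form of the rule can be sketched directly in code. The function name and the numeric values below are illustrative assumptions, not from the text:

```python
# Sketch of Bayes' rule in the form  P(H|E) = [P(E|H) / P(E)] * P(H).
def posterior(prior_h, likelihood_e_given_h, p_e):
    """Posterior P(H|E); the factor likelihood_e_given_h / p_e is
    the impact of the evidence E on the probability of H."""
    return likelihood_e_given_h / p_e * prior_h

# A hypothesis with prior 0.01, where the evidence is nine times more
# likely under H than overall (0.9 vs. 0.1), ends up with a posterior
# of approximately 0.09: the evidence multiplies the prior by 9.
print(posterior(0.01, 0.9, 0.1))
```

The factor P(E|H)/P(E) here equals 9, which is why the prior 0.01 is scaled up ninefold.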
       Example: Imagine that I have various hypotheses about the nature of a newborn baby of
a friend, including:
    H1: the baby is a brown-haired boy.
    H2: the baby is a blond-haired girl.
    H3: the baby is a dog.
Then consider two scenarios:
I'm presented with evidence in the form of a picture of a blond-haired baby girl. I find this
evidence supports H2 and opposes H1 and H3.
I'm presented with evidence in the form of a picture of a baby dog. Although this evidence,
treated in isolation, supports H3, my prior belief in this hypothesis (that a human can give birth
to a dog) is extremely small, so the posterior probability is nevertheless small.
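The second scenario can be made concrete with a short sketch. The priors and likelihoods below are assumed for illustration only; the text gives no numbers:

```python
# Illustrative priors over the three hypotheses (assumed values).
# H3 gets a tiny prior: a human giving birth to a dog is extremely unlikely.
priors = {"H1_brown_boy": 0.4999995, "H2_blond_girl": 0.4999995, "H3_dog": 1e-6}

# Likelihood of being shown a picture of a baby dog under each hypothesis.
likelihood = {"H1_brown_boy": 0.001, "H2_blond_girl": 0.001, "H3_dog": 0.99}

# Posterior is proportional to prior times likelihood, normalised
# over all three hypotheses.
unnorm = {h: priors[h] * likelihood[h] for h in priors}
z = sum(unnorm.values())
posteriors = {h: u / z for h, u in unnorm.items()}

# Although the evidence favours H3 in isolation (0.99 vs. 0.001),
# its posterior remains below 0.001 because the prior is so small.
```

The tiny prior dominates the strong likelihood, which is exactly the point of the scenario.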
                                   The critical point about Bayesian inference, then, is that it provides a principled way of combining
                                   new evidence with prior beliefs, through the application of Bayes’ rule. (Contrast this with
                                   frequentist inference, which relies only on the evidence as a whole, with no reference to prior
                                   beliefs.) Furthermore, Bayes’ rule can be applied iteratively: after observing some evidence, the
                                   resulting posterior probability can then be treated as a prior probability, and a new posterior
                                   probability computed from new evidence. This allows for Bayesian principles to be applied to
                                   various kinds of evidence, whether viewed all at once or over time. This procedure is termed
                                   “Bayesian updating”.
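The iterative procedure can be sketched as follows. The two-hypothesis coin example and its numbers are illustrative assumptions, not from the text:

```python
def update(prior, likelihoods):
    """One step of Bayesian updating: the unnormalised posterior is
    prior times likelihood, renormalised over all hypotheses."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Two hypotheses about a coin: fair (P(heads) = 0.5) or biased (P(heads) = 0.9).
belief = [0.5, 0.5]
for flip in "HHHT":
    lik = [0.5, 0.9] if flip == "H" else [0.5, 0.1]
    belief = update(belief, lik)  # yesterday's posterior is today's prior
```

Each pass through the loop treats the current posterior as the prior for the next observation, which is precisely the "Bayesian updating" described above.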
                                   Bayesian Updating


                                   Bayesian updating is widely used and computationally convenient. However, it is not the only
                                   updating rule that might be considered “rational”.
                                   Ian Hacking noted that traditional “Dutch book” arguments did not specify Bayesian updating:
                                   they left open the possibility that non-Bayesian updating rules could avoid Dutch books. Hacking
                                   wrote “And neither the Dutch book argument, nor any other in the personalist arsenal of proofs
                                   of the probability axioms, entails the dynamic assumption. Not one entails Bayesianism. So the
                                   personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a
                                   personalist could abandon the Bayesian model of learning from experience. Salt could lose its
                                   savour.”



                                     Did u know? There are non-Bayesian updating rules that also avoid Dutch books. The
                                     additional hypotheses needed to uniquely require Bayesian updating have been deemed
                                     to be substantial, complicated, and unsatisfactory.




          128                               LOVELY PROFESSIONAL UNIVERSITY