
Unit 7: Probabilistic Reasoning




          Introduction

          In statistics, Bayesian inference is a method of inference in which Bayes' rule is used to update the
          probability estimate for a hypothesis as additional evidence is learned. Bayesian updating is an
          important technique throughout statistics, and especially in mathematical statistics. For some
          cases, exhibiting a Bayesian derivation for a statistical method automatically ensures that the
          method works as well as any competing method. Bayesian updating is especially important in
          the dynamic analysis of a sequence of data. Bayesian inference has found application in a range
          of fields including science, engineering, philosophy, medicine, and law.
          In the philosophy of decision theory, Bayesian inference is closely related to discussions of
          subjective probability, often called “Bayesian probability”. Bayesian probability provides a
          rational method for updating beliefs; however, non-Bayesian updating rules are compatible
          with rationality, according to philosophers Ian Hacking and Bas van Fraassen.

          7.1 Bayesian Probabilistic Inference

          Bayesian inference derives the posterior probability as a consequence of two antecedents, a
          prior probability and a “likelihood function” derived from a probability model for the data to
          be observed. Bayesian inference computes the posterior probability according to Bayes’ rule:
                   P(H|E) = P(E|H) · P(H) / P(E)
          where
               | means given.
               H stands for any hypothesis whose probability may be affected by data (called evidence
               below). Often there are competing hypotheses, from which one chooses the most probable.

               E, the evidence, corresponds to data that were not used in computing the prior probability.
               P(H), the prior probability, is the probability of H before E is observed. This indicates
               one’s preconceived beliefs about how likely different hypotheses are, absent evidence
               regarding the instance under study.
               P(H|E), the posterior probability, is the probability of H given E, i.e., after E is observed.
               This tells us what we want to know: the probability of a hypothesis given the observed
               evidence.

               P(E|H), the probability of observing E given H, is also known as the likelihood. It indicates
               the compatibility of the evidence with the given hypothesis.
               P(E) is sometimes termed the marginal likelihood or “model evidence”. This factor is the
               same for all hypotheses being considered, as can be seen from the fact that the hypothesis
               H does not appear in it, unlike in all the other factors. It therefore does not enter into
               determining the relative probabilities of different hypotheses.
          Note that what affects the value of P(H|E) for different hypotheses H are only the factors P(H)
          and P(E|H), which both appear in the numerator; the posterior probability is therefore
          proportional to both. In words:

               The posterior probability of a hypothesis is determined by a combination of the inherent
               likeliness of the hypothesis (the prior) and the compatibility of the observed evidence
               with the hypothesis (the likelihood).
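
          A small sketch of this proportionality, again in Python and with made-up priors and
          likelihoods for three competing hypotheses: the products P(E|H) · P(H) already determine
          which hypothesis is most probable, and dividing by P(E) (their sum) only rescales the values
          so that the posteriors add up to one.

          # Assumed priors P(H) and likelihoods P(E|H) for three competing hypotheses.
          priors = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
          likelihoods = {"H1": 0.10, "H2": 0.40, "H3": 0.70}

          # Numerator of Bayes' rule for each hypothesis: prior times likelihood.
          unnormalised = {h: priors[h] * likelihoods[h] for h in priors}

          # P(E) is the same for every hypothesis: the sum of all the numerators.
          p_e = sum(unnormalised.values())

          # Posterior P(H|E): dividing by P(E) rescales but does not reorder.
          posteriors = {h: v / p_e for h, v in unnormalised.items()}

          for h in priors:
              print(h, round(unnormalised[h], 3), round(posteriors[h], 3))
          # H1 0.05 0.161   H2 0.12 0.387   H3 0.14 0.452  -> H3 is most probable

          Note that H3 has the smallest prior but wins on the strength of its likelihood, while H1's
          large prior cannot compensate for evidence that fits it poorly.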



